ABSTRACT



The outcomes and impacts of adult literacy education in the 
United States were examined through a qualitative assessment of the pertinent 
research conducted since the late 1960s. A comprehensive literature search 
identified approximately 115 outcomes and impacts studies. Of the 68 studies 
found to have an outcomes component, the 23 most credible ones were selected 
and case studies were prepared for each. It was concluded that participation 
in adult literacy education most likely results in employment and earnings 
gains and has a positive influence on participants' continued education. 
Although the evidence suggested that participants in welfare-sponsored adult
literacy education do experience a reduction in welfare dependence, the 
evidence as to whether adult literacy education in general reduces welfare 
dependence for participants was inconclusive. In general, adult literacy 
education had positive impacts on high school equivalency certificate 
acquisition, participants' self-image, parents' involvement in their 
children's education, and learners' achievement of their personal goals. It 
was recommended that a system consisting of the following elements be 
developed to measure the outcomes and impacts of adult literacy education: a national
outcome and impact reporting system, a national longitudinal evaluation, and
systematic funding and improvement of state and local outcome studies.
EXECUTIVE SUMMARY

This study investigated the outcomes and impacts of adult literacy 
education through a qualitative assessment of the outcomes and impacts research 
conducted since the late 1960s. Outcomes are the changes in learners that occur as 
a result of their participation in adult literacy education. Impacts are the changes 
that occur in the family, community, and larger society as a consequence of 
participation. 

The goals of the study were to make reasoned inferences about the 
effectiveness of adult literacy education in the United States; to identify common 
conceptual, design, and methodological problems inherent in the outcome studies 
conducted; to raise and discuss issues for policy; and to make recommendations 
for the design and conduct of future outcomes studies. 

Through a comprehensive literature search, approximately 115 outcomes 
and impacts studies were identified. All were obtained in either hard copy or
microfiche, and the 68 that were found to have an outcomes component
were abstracted. Each study was then evaluated according to the following 
criteria: the study included an outcome/impact component; the report was 
adequately documented with respect to design and methods; there was an 
adequate number of cases; the sampling plan was adequate; data collection 
procedures were adequate (i.e., were not tainted by substantial attrition or biased 
by other factors); objective measures, rather than self-report, were used to 
measure outcomes; measures, especially tests, were valid and reliable; the 
research design included a control or comparison group; and inferences logically
followed from the design and data. 

Based on this evaluation, 23 studies were selected as being the most 
credible, and case studies were prepared for each. Studies are presented in five 
categories: national, state-level, welfare, family literacy, and workplace literacy. 
From the 23 studies, inferences about program effectiveness are made. 

Program Effectiveness 

The 23 case studies represented evidence rather than proof of impact, and, 
like evidence in a trial, their findings were weighed in order to reach reasonable 
conclusions. Weighing had two dimensions. The first was the extent to which the 
various studies converged or diverged in respect to their findings on specific 
outcome/impact variables. Consensus across studies pointed toward 
effectiveness/ineffectiveness, while lack of consensus suggested an inconclusive
resolution. The second dimension was the credibility of the individual studies. 
When arriving at conclusions, more credible studies were weighed more heavily 
than less credible studies. 

The conclusions set forth are deemed to be reasonable inferences from the 
findings reported in the case studies. They do not represent proof. Indeed, it is
unlikely that any conceivable study or studies could arrive at certainty. Table 1 
presents the data used for this analysis. 

In interpreting Table 1 and the conclusions made from it, three caveats are 
in order. First, the variables included are those studied by a sufficiently large 
number of studies to enable reasonable conclusions. However, variable definitions 
and their units of measure vary among studies. In some studies, for example, 
learning gain is measured by the CASAS, while in others the TALS or TABE is
used. Second, if a given study reported a gain, the gain is listed as positive (y) in
the table irrespective of the size of the gain or the quality of the study’s 
methodology. In some cases the gains reported as positive are quite small, and in 
some cases the limitations of the study render claims of gains suspect. Third, the 
totals are aggregates of studies conducted at different times and on different 
populations of adult literacy learners, welfare clients and employees being 
examples. Drawing conclusions from such aggregates presumes that doing so is 
both valid and meaningful. 

From the case studies, as summarized in Table 1, the following 
conclusions were made about the effectiveness of the adult literacy education 
program in the United States: 

1. In general, it is likely that participants in adult literacy education receive gains
in employment. 

2. In general, participants in adult literacy education believe their jobs improve 
over time. However, there is insufficient evidence to conclude that 
participation in adult literacy education causes job improvement. 

3. In general, it is likely that participation in adult literacy education results in 
earnings gain. 

4. In general, adult literacy education has a positive influence on participants’ 
continued education. 

5. Although the evidence suggests that participants in welfare-sponsored (e.g.,
JOBS Program) adult literacy education do experience a reduction in welfare
dependence, the evidence is inconclusive as to whether adult literacy
education in general reduces welfare dependence for participants. 

6. Learners perceive that participation in adult literacy education improves their 
skills in reading, writing, and mathematics. 

7. As measured by tests, the evidence is insufficient to determine whether or not 
participants in adult literacy education gain in basic skills. 

8. In general, adult literacy education provides gains in GED acquisition for 
participants entering at the adult secondary (ASE) level. 

9. Participation in adult literacy education has a positive impact on learners’ self-image.

10. According to learners’ self-reports, participation in adult literacy education 
has a positive impact on parents’ involvement in their children’s education. 

11. Learners perceive that their personal goals are achieved through participation
in adult literacy education. 

In the final chapter, conceptual, design, and methodological problems inherent
in the studies are discussed and implications for policy are presented, including
recommendations for:

1. relevant and measurable outcome standards and a feasible impact reporting
system; 

2. a comprehensive national longitudinal evaluation measuring long-term 
impact; and 

3. systematic funding and improvement of state and local outcome studies. 
PREFACE

This report has four chapters. The first, the Introduction, frames the issues 
that provide focus to the report. The second describes the study’s methodology. 
The third chapter, which is by far the longest, presents case studies of 23 studies 
of outcomes and impacts. The final chapter presents conclusions, implications, 
and recommendations. 

The case studies presented in Chapter 3 represent the data upon which the 
study’s conclusions are based. While from a researcher’s perspective it is critical 
that they be included in the report, it is recognized that this material may be of 
more interest to researchers than to many policy makers and practitioners. For 
those who are less interested in the case studies, it is recommended that Chapter 3 
be read last. 
THE OUTCOMES AND IMPACTS OF ADULT LITERACY EDUCATION

IN THE UNITED STATES 



CHAPTER 1: INTRODUCTION 



FRAMING THE ISSUE 

This paper examines the outcomes and impacts of adult literacy education 
in the United States. Before this examination can begin in earnest, however, it is 
important to frame the issue. What are outcomes and impacts? Why is it 
important to examine them? 

Outcomes are the changes that take place in learners as a result of their 
participation in adult literacy education. Outcomes imply cause and effect — 
participation in adult literacy education is the cause; measurable changes in 
knowledge, skills, attitudes, and behavior are the effects. Impacts are the changes 
in the family, community, and society in general that result from participation in 
adult literacy education, and, as with outcomes, they too imply cause and effect. 

As Brizius and Campbell (1991) note, assessments based on outcomes and 
impacts are distinct from assessments based on program processes, inputs, and 
outputs. Assessing inputs, for example, often entails detailed descriptions of 
learners’ characteristics to determine if programs are serving adequate numbers of 
learners and the kinds of learners programs intend to serve. Similarly, an input 
assessment might examine teachers’ characteristics, such as their years of 
experience, their attitudes, or their levels of certification, in order to assess 
whether the teaching force is competent. An assessment of output analyzes the 
products of adult literacy education. Typical output variables include the number 
of learners served during a given period, how learners were instructed, and the 
program retention rate. 

Many of the evaluation studies that have been conducted in adult literacy 
education in the past 30 years have primarily, or exclusively, focused on inputs 
and outputs rather than on outcomes and impacts, and there are several reasons for 
this. First, for the most part, the regulations and standards that programs have 
been required to meet in order to acquire state and federal funding have been 
based on inputs and outputs. Thus programs routinely gather and report input and 
output data, although the accuracy of these data has sometimes been questioned 
(Condelli & Kutner, 1997). Second, input and output data are easier
to collect than outcome and impact data. Although input data can be routinely
collected during the learner intake process and output data can be collected from 
program records such as teachers’ attendance reports, gaining accurate and 
meaningful outcome and impact data generally requires special evaluation 
research, which requires resources local programs lack and states are reluctant to 
allocate. 

Although input and output assessments are often useful, clearly outcome 
and impact assessments represent the “bottom line” with respect to determining 
the effectiveness of adult literacy education because, even when inputs and 
outputs are adequate and impressive, there is no guarantee that learners have 
actually learned and that society has reaped the benefits policy makers expect. 
Credible measurement of the outcomes and impacts of adult literacy education is 
critically important for at least two reasons: program accountability and program 
planning and improvement. 



ACCOUNTABILITY 

As Merrifield (1998) argues in the companion piece to this report, which 
addresses accountability, in the past decade accountability has emerged as a 
critical concern of policy makers, particularly those policy makers who exercise 
control over resource allocations. Indeed, accountability has been a concern for
elementary and secondary education, as witnessed by the movement to develop 
accountability standards led, in part, by the National Council of Teachers of 
Mathematics. It has also been a major concern for higher education, public 
health, and social work. 

In a report that focuses on accountability and public policy, Brizius and 
Campbell (1991) define accountability as follows: 

Performance accountability is a means of judging policies and programs 
by measuring their outcomes or results against agreed upon standards. A 
performance accountability system provides the framework for measuring 
outcomes — not merely processes or workloads. (p. 5)

As Brizius and Campbell suggest, accountability must be based on outcomes 
rather than on a mere reporting of inputs, outputs, or descriptions of program 
processes. The reason is simple. For the purpose of accountability, an adequate 
judgment of a policy, program, or system of programs must be based on the cause 
and effect relationship between what the program or system does and the benefits 
to individuals and society it produces. This relationship cannot be determined
unless the actual benefits (that is, outcomes and impacts) to individuals and 
society are measured and are measured in a way that allows decision-makers to 
logically infer cause and effect. 

Any discussion of accountability raises the questions of accountability to 
whom and for what. In adult literacy education, these questions have by no means 
been resolved. Many claim that adult literacy education should be primarily 
accountable to learners since learners are the clients of adult literacy education. 
Further, it is argued that, since participation in adult literacy education is 
voluntary, unless learners are able to meet their own goals, they will be reluctant 
to participate, and if they do happen to participate, they will quickly drop out. 

Others claim that adult literacy education should be accountable to society 
in general since society in general finances the program through tax dollars.
Given our political system, Congress is the surrogate for society in general at the
national level, and state legislatures are the surrogates at the state level. From a
practical point of view, Congress and state legislatures are the bodies that allocate
resources for adult literacy. Since with the purse goes power, the will of these 
legislative bodies can hardly be ignored. Debate that revolved around the Careers 
Act of 1996, which was not passed, and HR 1385, which was passed by the House 
of Representatives in the spring of 1997, suggests that Congress is increasingly 
conceiving adult literacy education to be part of the nation’s workforce readiness
system. To this extent, accountability has increasingly been framed in human 
capital terms. Can adult literacy education be accountable to adult learners and to 
society as represented by Congress and state legislatures? If the answer is no, 
which locus of accountability should prevail? 

PROGRAM IMPROVEMENT AND PLANNING 

Outcome assessment is a critical tool for program planning and 
improvement, for, quite obviously, weaknesses cannot be corrected and strengths 
cannot be capitalized upon unless they are systematically identified. When the 
literature on outcome assessment is considered, however, it must be concluded 
that outcome assessment has not been a widespread and systematic strategy for
program planning and policy formation in adult literacy education. Since the 
inception of the Adult Education Act, there have been but a handful of credible 
state-level outcome assessments in adult literacy education, and, although the U.S. 
Department of Education has commissioned three national assessments of the 
adult literacy education system, these evaluations have focused primarily on input 
and output description, and the outcome data in each of them were so seriously
flawed that sound inferences have been impossible. Systematic evaluation at the 
local level is rare indeed. As this report will demonstrate, the relatively few 
formal outcome assessments that have been conducted at the state and national 
levels are of such questionable validity that they are nearly useless for program 
planning and policy formation. 

Demonstrating accountability and use of outcome assessment in program 
planning and improvement has been particularly difficult in adult literacy 
education for several reasons that will be discussed here. 

GOALS 

A classic evaluation begins with a delineation of program or system goals. 
Next, the goals are broken down into a series of objectives, and the objectives are 
“operationalized” into a set of variables for measurement. One of the problems 
inherent with outcome assessments in adult literacy education is that there is 
simply no consensus regarding what adult literacy is or what its goals should be. 
While some believe that adult literacy should be defined as the acquisition of a set 
of generalized skills revolving around reading, writing, and computation, or, as 
with the National Adult Literacy Survey (Kirsch, Jungeblut, Jenkins, & Kolstad, 
1993), prose literacy, document literacy, and quantitative literacy, others argue 
that literacy cannot be defined outside the context in which it is used. Accepting 
the context-based definition, there are multiple literacies rather than a single 
literacy. Perhaps because it has been so difficult to operationalize the construct of 
adult literacy education for measurement, there has been a tendency to select more 
global goals as the basis for outcomes and impacts in accountability and program 
improvement. For example, in a policy paper commissioned by the U.S. 
Department of Education’s Office of Vocational and Adult Education, Condelli 
and Kutner (1997) identify seven relatively global outcome accountability 
measures: economic impact, credentials, learning gains, family impact, further 
education and training, community impact, and customer satisfaction. Except for 
learning gains and perhaps credentials, none of these goals has to do directly with 
the skills and knowledge expected from adult literacy education. 

A focus on global goals raises the issue of what one can realistically 
expect adult literacy education to achieve. Adult literacy education is an 
educational program. Thus it is obviously reasonable to expect learners to learn, 
that is, to acquire knowledge, skills, and new perspectives as a consequence of 
their participation. But to what extent can adult literacy education be expected to 
produce outcomes that are at best indirectly related to what is learned in an adult
literacy program? For example, since there is a multiplicity of variables that
affect whether a person who completes adult literacy instruction will gain 
employment, and since many have to do with the state of the economy and the life 
situation of the individual, can adult literacy education reasonably be held 
accountable for its graduates’ employment status? If the answer is yes, given the 
other potent variables that affect learners’ economic gain, how much gain should 
be expected purely as a consequence of participation in adult literacy education? 

MEASUREMENT 

It is axiomatic to say that an outcome variable cannot be used for the 
purposes of outcome assessment unless it can be adequately measured. Yet 
measurement in adult literacy education has proved to be quite problematic. 
Tested learning gain is a case in point. Tested learning gain is considered to be an 
important outcome measure in this study because it has been employed in many 
studies of outcomes and impacts and because it is presumed by many to be a 
direct measure of instructional gain. Yet the measurement problems associated 
with tested learning gain are substantial. 

First of all, there is controversy over whether learning gain can be 
adequately measured through standardized testing. For example, Fingeret and 
Drennon (1997) suggest that, because the shared meanings learners ascribe to 
symbols vary in different cultural contexts, literacy should be considered as 
practices that differ as context changes. If the very definition of literacy varies 
substantially with context, it would be difficult and probably impossible to 
measure something so specific with a generalized standardized test. Then there is 
the issue of test sensitivity. One of the contradictions in outcome measurement is 
that detailed qualitative studies such as those conducted by Fingeret and Danin 
(1991) and Fingeret (1985) show that learners do report literacy gains that are 
important to their lives, whereas studies that use standardized tests such as the 
TABE tend to show small, and in some cases no, gains. The reason may be that 
many of the personally important gains learners achieve are too small to be 
recorded on standardized tests. This may be especially true for beginning readers. 
For example, Heath (1983) notes the important impact of being able to write a
simple memory list or note to one’s children for adults who could not previously 
perform these tasks. Yet despite the impact on learners’ lives, these gains would 
probably not register on most standardized tests. 
Because there are multiple definitions of adult literacy education, and
because under the Adult Education Act states have great latitude in establishing 
testing policy, there is no single standardized test that is appropriate or in use. 
Indeed, as the National Evaluation of Adult Education Programs found 
(Development Associates, 1992), of those programs that test regularly, 68 percent 
used the Test of Adult Basic Education (TABE), 12 percent used the Slosson Oral 
Reading Test (SORT), 21 percent used the Adult Basic Learning Examination 
(ABLE), 20 percent used the Wide Range Achievement Test (WRAT), 14 percent 
used the Comprehensive Adult Student Assessment System (CASAS), and 31
percent used locally developed tests. Over one third of the programs did not test
at all at student intake. Clearly the lack of a common measure for tested learning 
gains makes accountability for learning gain difficult, especially at the national 
program level. 

Just as serious as the lack of a common measure is the appropriateness of 
the measures available. For a test of learning gain to be appropriate, it must 
reflect what is taught in instruction. Yet what is taught in instruction can vary 
widely among programs and states. An example that highlights the problem is the 
fact that some states require all programs to pretest and posttest using a 
standardized test, the TABE being the most common. Although the TABE may 
be appropriate for those programs that use a general, basic skills approach to 
literacy, it is not appropriate for those programs that gear instruction to a 
particular context. Workplace literacy, which focuses instruction on the tasks 
learners use in the workplace, is an excellent example of a contextualized 
approach. Because workplace literacy instruction is not geared to the specific 
skills the TABE measures, the TABE is likely to underestimate literacy growth, 
and when basic skills-oriented programs are compared to workplace literacy 
programs on the TABE, workplace literacy is likely to score lower regardless of 
the quality of instruction. 

Lack of adequate controls in measurement is another serious problem. A 
commonly used measure of economic gain, for example, is increased income for 
employed learners who have completed the program. Increases in income, 
however, are affected by many powerful factors that have nothing to do with 
literacy acquisition — factors such as the rate of inflation, increases in the 
minimum wage, the strength of the economy, and job tenure. When increases in 
income are measured by pre-program and post-program measures, or more simply 
by a post-program design, it is simply impossible to infer that the increase (or 
decrease, for that matter) was caused by participation in adult literacy education. 
The only way causality can be inferred is through the comparison of those who 
completed adult literacy education with a group of like individuals who did not
participate in adult literacy education. Then the difference between the groups 
with respect to income gained can be inferred to have been caused by 
participation. The point is that, unless it is known with confidence that 
participation in adult literacy education caused a particular outcome, such as the 
achievement of increased income, little is known, and if little is known, how can 
reasonable policy be made? For many reasons that will be discussed later, 
however, it is so difficult to create adequate meaningful comparison groups in the 
real world of adult literacy education that few outcome studies have ever done it. 
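
The comparison-group reasoning above can be illustrated with a minimal sketch in Python. The income figures and function names are hypothetical and are not drawn from any of the studies reviewed. The sketch also assumes that the two groups are alike in every respect other than participation, which is precisely the condition that is difficult to satisfy in practice.

```python
from statistics import mean


def mean_gain(pre, post):
    """Average change in income between the pre- and post-program measures."""
    return mean(after - before for before, after in zip(pre, post))


def comparison_group_estimate(part_pre, part_post, comp_pre, comp_post):
    """Participants' mean gain minus the comparison group's mean gain."""
    return mean_gain(part_pre, part_post) - mean_gain(comp_pre, comp_post)


# Hypothetical annual incomes in dollars (illustrative only, not study data).
participants_pre = [9000, 10000, 11000]
participants_post = [10500, 11500, 12500]
comparison_pre = [9000, 10000, 11000]
comparison_post = [10000, 11000, 12000]

# A pre/post design alone attributes the whole gain to the program.
naive_gain = mean_gain(participants_pre, participants_post)

# Subtracting the comparison group's gain removes changes that affected
# everyone, such as inflation or a minimum wage increase.
adjusted_gain = comparison_group_estimate(
    participants_pre, participants_post, comparison_pre, comparison_post
)

print(naive_gain, adjusted_gain)
```

In this invented example, the pre-program and post-program measures alone would suggest a larger gain than the comparison supports, because part of the increase also occurred among those who did not participate.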

Measures that rely on learners’ self-reports also raise an issue. There is 
nothing inherently wrong with self-report measures. Indeed, researchers often 
have no alternative but to use self-report. When learners are asked to self-report 
on items that are essentially objective, such as their gender and age, it can be 
assumed that truthful responses are being given. However, when learners are 
asked to self-report on questions for which some answers are socially acceptable 
and other answers are not, response bias becomes a concern. Response bias is 
also a concern for questions that require a long period of recall, questions that ask 
for information the learner might not know “offhand,” and questions that are 
vague. 



Asking learners to self-report on their reasons for dropping out is a case in 
point. Dropping out of school carries with it a social stigma. Thus, when asked 
this question, there is the potential for learners to respond with socially acceptable 
answers, such as sickness, lack of transportation, etc., and avoid socially 
unacceptable responses, such as lack of learning progress or conflicts with 
teachers and/or other students. Because self-report measures are susceptible to 
response bias in many cases, objective measures are usually preferable, and 
outcome studies that rely extensively or exclusively on self-report must be 
regarded with a degree of suspicion. 

CAPACITY 

The issue of capacity has two dimensions with respect to outcome 
assessment: the extent to which programs have the capacity to systematically 
gather the data necessary for outcome-based evaluation and the extent to which 
programs have the capacity to achieve goals established for them as policy. 

From 1990 to 1994, the National Evaluation of Adult Education Programs 
(NEAEP) conducted an assessment of the federal adult literacy education 
program. In the end, the evaluation proved to be as much an assessment of the
system’s capacity to provide useful outcome information as it was an evaluation of
the system itself. Excerpts from NEAEP’s Executive Summary (Young et al.,
1994b) demonstrate this point: 

For inclusion in the evaluation, each program had to receive financial 
support through the basic grants provisions of the Adult Education Act. 
However, there is no available list of grantees, and funding practices and 
definitions vary among the states. (p. 6)

Information at the state level on local programs varies considerably in 
content and quality. Some programs did not have information on the 
composition of staff or the nature of instruction provided at different sites.
Nor did many programs have any precise idea of the number of adults 
newly enrolled each year or the number of different individuals enrolled at 
any given time or over the period of a program year. (p. 7) 

Key personnel and the location of instructional sites changed during the
course of the study in many projects. Within the first six months of data
collection, for example, 16 percent of program directors trained in the
requirements of the study had departed, sometimes because their jobs had
vanished. (p. 7)

The experience of the NEAEP and others who have attempted to conduct
outcome evaluations partially or totally for the purpose of accountability
demonstrates the following problems with the system’s capacity to generate
accurate outcome data:

• Unlike elementary, secondary, and higher education, where learners enroll and 
complete at discrete times, adult education programs tend to maintain open 
enrollment and learners attend irregularly. This makes it very difficult to 
collect complete data on learners at specific time intervals. 

• The dropout rate for adult literacy education is quite high, approximately 60 
percent after 12 weeks (Young et al., 1994b). Furthermore, it is generally not 
known if non-persisters represent completions of learners who have met their 
goals, or whether non-persistence represents dropout for other reasons. 
Because non-completers are difficult to follow up, “experimental mortality” 
typically interjects a strong source of bias into learner data.
• Program staff lack expertise in data collection, particularly testing.
Furthermore, programs typically lack the resources to ensure accurate data
collection. Thus a significant amount of the data collected at the program
level is of questionable accuracy.

Program capacity has a second dimension. If adult literacy education is to 
be expected to meet outcome standards, then the system must have the capacity to 
meet outcome expectations. However, because of low per-student expenditures, a 
reliance on part-time staff, fragmentation of service, and structural marginality in 
comparison with elementary, secondary, and higher education, the adult literacy 
education system’s capacity to meet the expectations of either learners or the 
public is low (Beder, 1996). Indeed, as the NEAEP found, mean expenditures per 
student per year are about $258, and over 80 percent of the teachers of adult 
literacy work part time (Development Associates, 1992). To the extent that 
system capacity is low, achievement of outcome accountability must include a 
greater allocation of resources to program development as well as increases in the 
quantity and quality of outcome evaluation and the development of pertinent 
outcome standards. 



JUDGMENT 

As with any form of evaluation, outcome assessment requires judgment 
regarding whether an obtained outcome is exemplary, adequate, or sub-standard.
A comparative approach to judgment entails comparing the results of one program
or system with others. An external standards approach entails judgment with 
respect to established standards. Unfortunately, in adult literacy education, there 
has been little basis for judgment. First, as this report will show, because the 
outcome evaluations that have been conducted vary so widely in methods and 
measurement, and because their internal and external validity has been almost 
universally questionable, there is little basis for comparative judgment. Second, 
specific external standards have not as yet been established. For example, 
learning gain is a fundamental outcome that might be expected from participation 
in adult literacy education. Yet we do not know what constitutes adequate 
learning gain on any particular measure. 

Unless reasonable judgments can be made from outcome assessment, there 
is no meaningful way to separate good practice from poor practice, and the 
opportunity for program improvement and development is lost. 



NCSALL Reports #6 



January 1999 



CHAPTER 2: METHODOLOGY 



GOALS 

This study has four primary goals: 

• To make reasoned inferences about the effectiveness of adult literacy 
education in the United States. 

• To identify common conceptual, design, and methodological problems 
inherent in the outcome studies conducted. 

• To raise and discuss issues for policy. 

• To make recommendations for the design and conduct of future outcome 
studies. 



RESEARCH STRATEGY 

Based on an earlier analysis of the outcome literature on adult literacy 
education (Beder, 1991), it was not anticipated that a definitive outcome study (or 
studies) of adult literacy existed from which logical conclusions regarding 
program effectiveness could be inferred. When the outcome literature was 
reviewed in conjunction with this study, this, in fact, proved to be the case. 
Consequently, the strategy here was to analyze a wide range of outcome studies in 
order to make reasonable inferences about effectiveness from patterns of findings 
while taking research limitations into account. The analysis was qualitative in 
orientation. Although it was hypothetically possible to conduct a quantitative 
meta-analysis for some outcome variables — tested learning gain, for example — it 
was determined that valid data from a sufficient number of studies did not exist. 
Furthermore, statistical information critical for a quantitative meta-analysis was 
generally not reported for the existing studies. 

The first step was to identify a pool of research studies conducted since the 
inception of the federal adult literacy program that were available in the public 
domain and potentially included outcome assessment components. ERIC and 
other abstracting services were searched using numerous descriptors, and state and 
national policy makers were consulted regarding studies the search might have 
missed. The initial search identified approximately 115 studies. Next, abstracts 
were reviewed to determine which studies in the initial pool did, in fact, include 
outcome components. These studies were ordered in hard copy when available 
and secured in microfiche when hard copy was not available. Sixty-eight studies 
that included an outcome component were identified and acquired for assessment. 
Subsequently, each study was abstracted and evaluated according to the following 
criteria: 

• The study included an outcome/impact component. 

• The report was adequately documented with respect to design and methods. 

• There were an adequate number of cases. 

• The sampling plan was adequate (i.e., it could and did result in external 
validity). 

• Data collection procedures were adequate (i.e., they were not tainted by 
substantial attrition or biased by other factors). 

• Objective measures, rather than self-report, were used to measure outcomes. 

• Measures, especially tests, were valid and reliable. 

• The research design included a control group. 

• Inferences logically followed from the design and data. 

No study fully met these criteria. Finally, those studies that were assessed 
as being the most credible based on the above criteria were selected for in-depth 
analysis. In selecting the most credible studies, the researchers initially 
experimented with a numerical rating system that assigned points for each study 
selection criterion. However, this method was abandoned when it proved to be 
ineffective. Under the numerical rating system, for example, it was possible for a 
study that included a control group, used objective measures, and had an adequate 
number of cases, but that also made poor or incomplete inferences, to score 
inappropriately high. Conversely, a methodologically more limited study that 
made intelligent and sound inferences despite its acknowledged limitations could 
score inappropriately low. In the end, a more holistic system of selection was 
employed in which two researchers read all outcome and impact studies identified 
by the project, developed a list of the studies they considered most credible, and 
resolved differences by discussion. There was 100 percent consensus on the 
studies eventually selected. 

In order to represent a broad range of adult literacy contexts, studies were 
organized into five categories: national, state-level, workplace literacy, welfare, 
and family literacy. The quality of studies varied among these categories. For 
example, although there had been a proliferation of workplace literacy studies 
since 1991, when the federal workplace education program was established, many 
local-level workplace education studies were poorly conceived and conducted on 
meager resources. In contrast, because of the resources available, the national 
studies tended to be of higher quality. While it was originally anticipated that 10 
to 12 studies would be selected for in-depth analysis, 23 studies were eventually 
selected. Studies that focused on English as a second language, the incarcerated, 
and the handicapped were excluded from analysis. 



The data for the 23 case studies that follow were acquired from published 
research reports that vary in completeness and clarity. Hence the case studies 
necessarily vary accordingly. For many of the studies, measuring outcomes and 
impacts was but one of several research objectives. In such cases, only the 
portions that pertain to outcomes and impacts are reflected in the case studies. 
CHAPTER 3: ANALYSIS 



NATIONAL STUDIES 

There have been three national evaluations of the federal adult literacy 
program funded under the Adult Education Act. The first was conducted in 1973 
(Kent, 1973), the second in 1980 (Young et al.), and the most recent between 
1990 and 1994 (Development Associates). In addition, there is a 1968 study that 
evaluates components of the adult literacy education program during the period 
when the program was administered by the Office of Economic Opportunity, 
prior to the enactment of the Adult Education Act (Greenleigh Associates, 1968). 
There has also been a national assessment of the Even Start Program (St. Pierre et 
al., 1993) which will be reported in the section on family literacy. Although most 
of the national evaluations focused primarily on input and output analyses, each 
included at least one component which dealt with outcome and impact concerns. 
Studies are reviewed chronologically, the most recent coming first. 

1. The National Evaluation of Adult Education Programs (NEAEP) 

1990-1994 



NEAEP Reports: 



Development Associates. (1992). National Evaluation of Adult Education 
programs first interim report: Profiles of service providers. Arlington, VA: 
Author. 



Development Associates. (1993). National Evaluation of Adult Education 
programs second interim report: Profiles of client characteristics. Arlington, VA: 
Author. 



Development Associates. (1994). National Evaluation of Adult Education 
programs third interim report: Patterns and predictors of client attendance. 
Arlington, VA: Author. 



Young, M. B., Fitzgerald, N., & Morgan, M. A. (1994a). National Evaluation of 
Adult Education programs fourth interim report: Learner outcomes and program 
results. Arlington, VA: Development Associates. 
Young, M. B., Fitzgerald, N., & Morgan, M. A. (1994b). National Evaluation of 
Adult Education: Executive Summary. Arlington, VA: Development Associates. 



Reanalysis: 

Cohen, J., Garet, M., & Condelli, L. (1996). Reanalysis of data from the National 
Evaluation of Adult Education Programs: Methods and results. Washington, DC: 
Pelavin Research Institute. 



The NEAEP was the most comprehensive and expensive evaluation — the 
initial contract was for $2,839,740 — of the federal adult literacy education 
program yet conducted. Data collection began in 1990, and the final Executive 
Summary was issued in 1994. NEAEP’s results are presented in four separate 
interim reports and a final executive summary. The first report provides a 
descriptive analysis of service providers, the second describes client 
characteristics, the third focuses on predictors of client attendance, and the fourth 
is concerned with learner outcomes and program results. Virtually all the 
outcome and impact data are found in the fourth report and Executive Summary. 

In addition to the reports published by the NEAEP, the report of a reanalysis of 
the NEAEP data conducted by the Pelavin Research Institute is highly relevant to 
this analysis. 

Data Collection Procedures 

NEAEP utilized seven procedures to collect data (Young et al., 1994a). 

The Universe Survey was designed to collect descriptive data from all federally 
funded service providers. Administered by mail in 1990, the Universe Survey 
collected data from 2,619 programs, 93 percent of the universe. 

The Comprehensive Program Profile was used to collect more detailed 
descriptive data from a smaller number of programs (n= 131). Results were 
weighted to permit generalizability to the universe. For programs that provided 
client data from more than one site, site-level data rather than program-level data 
were used in analysis. 

The Client Intake Record A was completed for each sampled student at the 
time of intake by program staff trained by the researchers. Client Intake Record A 
collected basic demographic and client-related program information. Analysis is 
based on 22,548 respondents from 116 local programs. 
The Client Intake Record B was completed for all sampled clients who 
completed at least one instructional session. It collected more detailed 
information, such as reliance on public assistance, living arrangements, 
employment status, and reasons for participation. Data were collected from 
13,845 learners in 108 programs. 

The Client Update Record provided instructional and attendance data. It 
was completed by program staff at five- to eight-week intervals during a period 
that started at program entry and ended up to 18 months later. Data were 
collected from 18,461 learners in 110 programs. 

The Client Test Record yielded learners’ test scores on either the 
Comprehensive Adult Student Assessment System (CASAS) or the Test of Adult 
Basic Education (TABE). Pretests were administered near the inception of 
enrollment and posttests were administered at varying intervals during instruction. 
Pretest scores were obtained from 8,581 learners in 88 programs and posttest 
scores were obtained from 1,919 learners in 65 programs. 

The Telephone Follow-up Survey yielded data on the quality of 
instruction, reasons for termination, and the results of instruction from a sample of 
clients six months after they had terminated instruction. Data were obtained from 
5,401 clients in 109 programs. Most of the data reported from the telephone 
survey pertains to 4,653 subjects who had attended at least three classes. As the 
NEAEP notes: 

As described in Appendix C, the weighting of the telephone survey 
essentially adjusts the results for non-response bias. After weight 
adjustments were applied to the telephone survey, respondents were very 
similar to the clients of the national sample who began instruction, except 
for having received fewer hours of instruction and having been out of adult 
education for six months. (Young et al., 1994a, p. 46) 

Commenting on the telephone survey, Cohen, Garet, and Condelli (1996) noted 
that “our review of the telephone survey found that the survey respondents 
differed from non-respondents in many ways, suggesting that estimates from this 
data ought not to be generalized to the population of clients” (p. xi). 

Although the NEAEP provided a detailed and comprehensive description 
of program characteristics, staff characteristics, instructional practices, and learner 
attendance patterns, it is the outcome data that is of concern here. Most outcome 
data were collected from the Client Record Forms, the Client Test Records, and 
the Telephone Follow-up Survey and can be found in the Fourth Report (Young et 
al., 1994a). 

Problems and Issues 

The NEAEP study is particularly important for two reasons. First, many 
of the findings — especially the descriptive information on programs, staff, and 
learners — are useful in their own right. Second, and perhaps ultimately more 
important, the problems the NEAEP encountered provide us with a detailed case 
study of the problems of conducting large-scale outcome and impact assessments 
in adult literacy education. In fact, the problems experienced by the NEAEP in 
data collection proved to be so severe that the validity of the NEAEP’s findings 
was called into question by the U.S. Department of Education, which 
commissioned the study. Because of these reservations, a second contract was 
awarded to the Pelavin Research Institute to “conduct a comprehensive review of 
the study methodology, quality of data, and statistical methods used in prior 
analysis; and to validate reported findings, make needed corrections and conduct 
new analyses” (Cohen, Garet, & Condelli, 1996, p. vi). For this reason, and 
because the NEAEP’s limitations are well documented, the problems and lessons 
associated with conducting the NEAEP will be discussed here in detail. As the 
reader will discover, to one degree or another many of the issues raised by the 
NEAEP pertain to most of the outcome and impact studies in adult literacy 
conducted to date. 

At the heart of the NEAEP was to be a set of approximately 150 local 
programs that were to provide detailed program information and longitudinal 
(18-month) information on learners, including pre- and posttests. To enable 
generalization from programs and their clients to the entire federal adult literacy 
education program, participating programs were to be selected using a probability 
of selection proportionate to size methodology (Development Associates, 1992). 
In accord with this methodology, 18 programs with 20,000 or more participants 
were so large that they were automatically selected for the study (i.e., they had a 
probability of selection equal to 1). 
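
The selection logic described above can be illustrated with a minimal sketch. It is not the NEAEP's actual procedure: the function name, the enrollment figures, and the rule used to flag "certainty" programs (expected number of selections of 1 or more, rather than the 20,000-participant cutoff reported by the NEAEP) are all assumptions made for illustration, and systematic selection is only one of several ways to draw a probability-proportionate-to-size sample.

```python
import random


def pps_select(sizes, n, seed=0):
    """Select n programs with probability proportionate to size (PPS).

    sizes: dict mapping program id -> enrollment (hypothetical inputs).
    Returns a dict mapping each selected program id -> selection probability.
    """
    rng = random.Random(seed)
    remaining = dict(sizes)
    probs = {}

    # Step 1: peel off "certainty" programs, i.e. programs so large that
    # their expected number of selections would be 1 or more.
    while True:
        slots = n - len(probs)
        total = sum(remaining.values())
        certain = [p for p, s in remaining.items() if slots * s >= total]
        if not certain:
            break
        for p in certain:
            probs[p] = 1.0
            del remaining[p]

    # Step 2: systematic PPS selection among the remaining programs.
    slots = n - len(probs)
    if slots > 0 and remaining:
        total = sum(remaining.values())
        interval = total / slots
        start = rng.random() * interval  # random start in [0, interval)
        targets = [start + k * interval for k in range(slots)]
        cum, k = 0.0, 0
        for p, s in remaining.items():
            cum += s
            while k < slots and targets[k] < cum:
                probs[p] = slots * s / total
                k += 1
    return probs


# Hypothetical enrollments; these are NOT NEAEP data.
enrollments = {"A": 50000, "B": 30000, "C": 4000, "D": 3000, "E": 2000, "F": 1000}
probs = pps_select(enrollments, n=4)
design_weights = {p: 1.0 / pr for p, pr in probs.items()}
print(probs)
print(design_weights)
```

In this sketch the design weight of each selected program is the inverse of its selection probability, so certainty programs carry a weight of 1. This is why the loss of a very large program, or doubts about its data, can distort weighted national estimates.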

Operational Definitions 

Although this procedure appears to be simple, in reality it was not. First 
there was the issue of defining what constituted a “program.” The NEAEP 
defined a program as the administrative unit that served as the sub-grantee for 
federal funds. As such, the City of New York and the Los Angeles Unified 
School District were defined as programs. As the NEAEP explained: 

Receipt of such funds, however, did not adequately target the 
administrative entities that should be included in the evaluation, for 
example, more than one grant may be made to the same administrative 
agent, such as separate grants for the ABE, ASE and ESL instructional 
components. Or sometimes a basic grant is awarded to a regional 
administrative service agency that has several subgrantees, some of which 
may be local school districts and other community-based organizations; 
and grantees exercise varying degree of administrative control over the 
service delivery agencies. (Young et al., 1994a, p. 4) 

The definition of “client” was also an issue. A client was conceptually 
defined as one who received services directly supported by the Adult Education 
Act. Yet because of multiple funding streams that flowed to local programs from 
state funds, JOBS, JTPA, and private sources, it was often difficult to determine if 
a given client was wholly, partially, or not at all supported by Adult Education 
Act funds. Learners who changed from classes supported by one funding source 
to classes supported by another funding source during the course of the evaluation 
caused further problems in definition. When a client became a client for the 
purposes of the study was also problematic. Although eligibility for the study was 
intended to occur at client intake, the NEAEP discovered that 16 percent of those 
who completed intake never engaged in instruction. Furthermore, in its reporting 
the federal government counts only those learners who have completed 12 hours 
of instruction as participants. The NEAEP found that 17 percent of those who 
completed the intake process never achieved 12 hours of instruction (Young et al., 
1994a, p. 10). 

The point is that it is impossible to select a sample that represents the 
universe of adult literacy education programs and clients if one cannot precisely 
define the universe and its components in operational terms. 

Program, Subject, and Site Attrition 

More serious, perhaps, was program and site attrition from the study. 
Although the NEAEP provided economic and other incentives for programs to 
cooperate in data collection, it had no direct authority over the participating 
programs. As a result: 
Our goal was to enlist participation of 150 local programs. When data 
collection began, 141 programs had agreed to participate, 114 from the 
initially selected set of 150, 25 first order replacements, and two 
replacements of replacements. After 10 months of data collection, 2 of the 
originally selected 150 had terminated operations, 3 had formally 
withdrawn from the evaluation, and another 5 had failed to submit data. 
Within the first six months, 16 percent of program directors trained in the 
requirements of the study had departed, sometimes because their positions 
had been abolished. (Young et al., 1994a, p. 5) 

Program and site attrition from the study and incomplete data supplied by 
programs wreaked havoc with the NEAEP’s weighting protocol based on program 
size and made the problem of generalizing to the universe complex. As a case in 
point, the adult literacy education program provided by the Los Angeles Unified 
School District is so large that it was automatically included in the study. Indeed, 
Los Angeles served about 10 percent of the national population and was about 72 
percent ESL (Cohen, Garet, & Condelli, 1996). However, the data supplied by 
Los Angeles were submitted late and differed from what might have been 
expected to such an extent that its accuracy was severely questioned. The 
problem is that, because of Los Angeles’ size, if the data is accepted as accurate 
and included, NEAEP’s findings differ substantially — especially with respect to 
ESL and variables associated with ESL — from when the data is not included. 
NEAEP included the Los Angeles data; the Pelavin reanalysis excluded it and 
adjusted the findings to account for the deletion. Commenting on the problems 
that program and site attrition caused for the weightings, and hence for the 
generalizability of the findings, the Pelavin reanalysis stated: 

If relatively complete sample coverage had been achieved, it would have 
been possible to estimate population parameters using appropriate design 
weights. Given the substantial non-coverage of clients within sites, 
however, as well as the substantial non-response at the site and program 
level, weighted estimates are likely to be biased, although the direction 
and extent of the bias is difficult to assess. (Cohen, Garet, & Condelli, 
1996, p. viii) 

With respect to the outcome data collected by the NEAEP, the inability of 
programs to supply accurate information at specified time intervals was a very 
serious problem. 
Originally it had been hoped that 150 programs would supply pretest and 
posttest data on participants. Ultimately, 101 agreed. Yet these programs were 
only able to provide pretest data on 57 percent of the potential base level of clients 
(11,354 out of 19,796). Furthermore, “the number of matched scores [pre- and 
posttest] obtained for clients on all tests constituted only 12 percent of the clients 
originally expected (i.e., 2,333 out of 19,796)” (Young et al., 1994a, p. 30). 
Although posttesting was to occur at fixed intervals, so many posttests submitted 
were not administered at the stipulated intervals that the NEAEP had to abandon 
the fixed interval protocol and instead report average hours of instruction for the 
posttest data. To compound things, after analysis of the test data submitted, the 
NEAEP discovered that much of it was either incomplete or tainted by ceiling or 
floor effects. Consequently, of the 19,796 matched pre- and posttests that the 
NEAEP had originally hoped to obtain, only 614 usable cases were obtained for 
analysis. Furthermore, 50 percent of the learning gain data on ABE/ASE was 
obtained from but three sites; 75 percent of the ESL data came from nine sites 
(Cohen, Garet, & Condelli, 1996). Assessing the validity of the tested learning 
gain data in relation to the problems encountered in data collection, the reanalysis 
study concluded: 

The implementation of the test plan was also poor, and this data should not 
be used to assess the capabilities of clients at intake. Some of the key 
evidence supporting this conclusion includes: 

• Only half the clients were pretested, and sites that pretested differed 
from sites that did not. At sites that pretested only some of their 
clients, pretested clients differed from those who were not pretested. 

• Programs reported perfect exam scores for a substantial proportion of 
pretested clients. 

• Less than 20 percent of eligible clients received a matched pretest and 
posttest. 

• Among clients eligible to be posttested, significant differences exist 
among those who were and were not posttested. 

• The available matched pre- and posttests were concentrated in a very 
few programs. 

These facts render the test data unusable. Therefore this reanalysis 
invalidates all of the findings concerning test results from the original 
analysis. (Cohen, Garet, & Condelli, 1996, p. xi) 
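
The extent of the test-data shortfall can be checked directly from the counts the NEAEP reported before the reanalysis excerpt above. The short sketch below simply recomputes the coverage rates against the 19,796 clients originally expected; the variable names are illustrative, and the last rate (usable cases as a share of expected clients) is derived here from the reported counts rather than quoted from the reports.

```python
# Counts as reported in the text (Young et al., 1994a).
expected_clients = 19_796   # matched pre/post tests originally hoped for
pretested = 11_354          # clients with a pretest score
matched = 2_333             # clients with matched pre- and posttest scores
usable = 614                # matched cases that survived data cleaning


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


coverage = {
    "pretested": pct(pretested, expected_clients),
    "matched pre/post": pct(matched, expected_clients),
    "usable after cleaning": pct(usable, expected_clients),
}

for label, value in coverage.items():
    print(f"{label}: {value:.1f} percent of expected clients")
```

The first two rates round to the 57 percent and 12 percent cited by the NEAEP. The third shows that the learning gain findings rest on roughly 3 percent of the clients the design had anticipated.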
Logistical Constraints 

NEAEP’s experience suggests that there were at least two reasons why it 
is difficult to collect accurate and timely data from local programs when local 
program staff are assigned to keep records and administer tests. First, the 
workforce in adult literacy education is predominately part-time. Indeed, the 
NEAEP found that over 80 percent of the teachers teach part-time (Development 
Associates, 1992). It is difficult to expect part-time staff who devote most of their 
time to instruction and have little training or experience in research to accurately 
collect the kind of data required of a credible outcome and impact evaluation. 

More important, perhaps, is that most adult literacy education programs maintain 
open enrollments, attendance tends to be irregular, and dropout rates tend to be 
high. These factors make it very difficult to collect data from learners at regular 
intervals, and the high attrition rates inject the threat of attrition bias. 

Findings 

The outcome and impact variables measured by the NEAEP included 
tested learning gain, clients’ self-report of learning gain, employment, further 
education, clients’ assessment of personal goal attainment, and how often clients 
read to their children. 

Tested Learning Gain 

Because the NEAEP lacked the resources to administer a common test to 
all subjects, and because variation in instructional goals and processes among 
programs made it inappropriate to administer a common test, the NEAEP used the 
tests that were already being administered in the programs selected. Although 
there are a number of instructional tests employed by local programs, there were 
but two tests that were in sufficient usage to yield enough cases for the national 
evaluation — the California Adult Student Assessment System (CASAS), which 
was used to measure learning gain for ESL, and the Test of Adult Basic Education 
(TABE), which was used for adult basic education (ABE) and adult secondary 
education (ASE). As already mentioned, the NEAEP was able to obtain only 614 
valid cases from an intended 19,796 potentially available cases. Because intervals 
between pre- and posttests varied, the mean hours of instruction between pre- and 
posttests was reported. For ESL students, who on average received 120 hours and 
14 weeks of instruction between the pre- and posttests, the learning gain on the 
CASAS was five scale points. ABE students received a mean of 84 hours of 
instruction between pre- and posttests and attended for an average of 15 weeks. 
On average their gain was 15 points on the TABE. Adult secondary students 
received a mean of 63 hours of instruction and gained 7 points on the TABE. All 
gains were statistically significant (one-sample t-tests) at the .001 level (Young et 
al., 1994a). 
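
For readers unfamiliar with the test cited, a one-sample t-test on gain scores asks whether the mean of the posttest-minus-pretest differences is distinguishable from zero. The following is a minimal sketch of the calculation only; the gain values are invented for illustration and are not NEAEP data, and the function name is hypothetical.

```python
import math
import statistics


def one_sample_t(gains, mu0=0.0):
    """Return (t statistic, degrees of freedom) for H0: mean gain == mu0."""
    n = len(gains)
    mean_gain = statistics.fmean(gains)
    sd = statistics.stdev(gains)          # sample standard deviation (n - 1)
    standard_error = sd / math.sqrt(n)
    return (mean_gain - mu0) / standard_error, n - 1


# Hypothetical posttest-minus-pretest scale-score gains (invented values).
hypothetical_gains = [4, 9, -2, 7, 5, 0, 6, 3]
t_stat, df = one_sample_t(hypothetical_gains)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

A large t statistic indicates only that the mean gain in the tested group is unlikely to be zero. It says nothing about whether the tested group represents all clients or whether instruction caused the gain.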

What do these results mean? It is difficult to say. First there is the issue 
of the data’s validity. Although Development Associates took care to “clean” the 
test data, when only 614 acceptable cases were obtained from a potential of over 
19,000 cases, validity is still a very major issue. The second issue has to do with 
standards. What is an acceptable learning gain after the mean hours of instruction 
reported? We do not know. Development Associates (Young et al., 1994a) notes 
that ABE students gained from an average 6.1 grade level on the TABE to a 7.4 
grade level and that adult secondary students gained from an 8.5 grade level to a 
9.3 grade level, but does this represent high, low, or moderate gain? Moreover, 
are grade level changes, as normed on elementary and secondary school students, 
valid indicators of learning gain in adult literacy education? Many would say they 
are not. 



The NEAEP was only able to obtain matched pre- and posttests on 12 
percent of the potential number of clients expected. Thus a high proportion of 
those who were pretested were not posttested. Did those who were both pretested 
and posttested differ from the original sample in important ways? Were they 
more able or less able? This is an important issue, especially in adult secondary 
education. There is anecdotal evidence that some of the most able adult 
secondary education students leave programs after but a few hours of instruction 
to take, and eventually pass, the GED. Were those ASE students who were 
available for posttesting the least able of those adult secondary education students 
who enrolled? We do not know. Finally, how do we know that the gains reported 
resulted from adult literacy education instruction? The only way to answer this 
question convincingly would have been to have had a control group with which to 
compare the gains. Without a control group, it cannot be determined whether the 
gains were real or due to other factors, a particular concern in ESL where 
presumably students were interacting with the English language outside of class. 

Through multiple regression analysis, the NEAEP went on to determine 
which background, attendance, and program variables influenced learning gain for 
ABE, ASE, and ESL. Two findings from this analysis are of particular import. 
First, almost two thirds (61 percent) of the variance for learning gain in reading 
for ABE was found to be accounted for by the pretest score in reading, which may 
be taken as a general measure of ability. For ESL the score was 48 percent of the 
variance, and for ASE the score was 19 percent. These findings suggest that, 
especially with respect to adult basic education, a learner’s initial “ability” is a 
potent predictor of learning gain. Second, total hours of instruction were not 
shown to be related to tested learning outcomes for either ABE or ASE. This 
finding runs counter to conventional logic which would assume that the more time 
adult literacy students spend in class, the more they would learn. The NEAEP did 
not offer explanations for the finding, but assuming that the finding is valid, the 
relationship between tested learning gain and hours of instruction certainly 
warrants more research. 

Most of the findings that pertain to employment, learners’ goals, and 
further education were derived from NEAEP’s telephone survey of former 
learners who had not attended for six or more months. From 109 local programs, 5,401 
clients responded, 86 percent of whom had attended for at least three sessions. 
Learners were asked if participation in the program had helped them improve their 
basic skills. For ESL, 44 percent responded “a lot” for reading and writing, 26 
percent for mathematics, 48 percent for speaking and listening, and 62 percent 
reported that they had been helped a lot in at least one skill level. For ABE the 
figures for “a lot” were 50 percent for reading and writing, 51 percent for 
mathematics, 48 percent for speaking and listening, and 68 percent reported that 
they had been helped in at least one skill level. For ASE the comparable figures 
were 45 percent reading and writing, 49 percent mathematics, 45 percent speaking 
and listening. Sixty-three percent reported they had been helped “a lot” in at least 
one skill level. The NEAEP examined the extent to which learners’ reports of the 
benefits they attained in reading coincided with tested improvement in reading 
and found convergence in 58 percent of the cases. 

Employment and Further Education 

With respect to employment, 63 percent of the learners reported that they 
were unemployed when they entered the program and 69 percent were employed 
at the time of follow-up. However, without a control group it is impossible to 
determine whether these statistically significant, but modest, gains were due to 
instruction or to other unknown reasons. In fact, when NEAEP asked those who 
became employed between enrollment and follow-up if what they learned in the 
program helped them get a job, a majority (57 percent) said no, suggesting that 
learners perceived that factors other than adult literacy education were critical for 
job acquisition. With respect to further education, of those learners who did not 
possess a high school diploma at intake, at follow-up 18 percent were enrolled in 
further education (11 percent post-secondary, 6 percent GED, 1 percent other), 44 
percent had no plans to enroll, and 38 percent expected to enroll within a year. 

Self-image and Learner Satisfaction 

Although enhancing learners’ self-image and self-esteem are not stated 
goals of the federal adult literacy education program, improved self-image is a 
common variable in outcome and impact studies. The NEAEP found that 65 
percent of the learners reported that they felt better about themselves at follow-up. 
These data are mitigated by the fact that the follow-up sample included both 
respondents who had terminated because they attained their goals and respondents 
who had dropped out for other reasons. When respondents were asked at follow- 
up why they left the program, 41 percent were designated by NEAEP as having 
“left satisfied,” 45 percent were designated as having left because of outside 
events beyond their control, and 7 percent left because of instructional factors. On the 
average, satisfied respondents had substantially more hours of instruction than 
respondents who had left for other reasons. 

Strengths 

• With respect to its variables, data sources, and scope, the NEAEP is the most 
comprehensive national evaluation of the federal adult literacy program to 
date. It provides a rich descriptive picture of adult literacy education in the 
United States. 

• The NEAEP’s reports are clearly written, adequately documented with respect 
to methods and procedures, and honest in their portrayal of limitations. 

Limitations 

• Attrition of programs, sites, and subjects from the study compromised the 
original sampling plan based on program size. Although the NEAEP adjusted 
the weightings to account for program and subject attrition, NEAEP’s 
adjustments were criticized in the reanalysis study. Thus the validity of the 
NEAEP with respect to the outcome data is problematic. 

• Because of data collection problems, only a small proportion of the intended 
posttest data was obtained and the intervals between pretest and posttest 
varied widely. The accuracy of the test data has also been questioned. The 
data collection problems associated with the test data were so severe that it is 
questionable whether any reasonable inferences can be made from these data. 

• Lack of a control group casts doubt on an inference that the outcomes 
obtained were caused by participation in adult literacy education. 

• With the exception of the learning gain data, outcomes are based on learners’ 
self-reports of benefits acquired. 

When the limitations of the NEAEP are considered with respect to 
outcome findings, it must be concluded that at best the findings are suspect and at 
worst they are unusable. Perhaps because the NEAEP cost almost $3 million and 
took four years to complete, the evaluation was subject to a considerably higher 
level of scrutiny than the national evaluations that preceded it. Because the data 
collection methods of the earlier evaluations are in many ways similar to the 
NEAEP, the suspicion lingers that the earlier national evaluations, too, were 
flawed in ways similar to the NEAEP and that these flaws either went unnoticed 
or were not reported. 



2. The 1980 National Evaluation 
An Assessment of the State-Administered Program of the 
Adult Education Act 



Young, M., Hipps, J., Hanberry, G., Hopstoch, P., & Golsmat, R. (1980). An 
assessment of the state-administered program of the Adult Education Act: Final 
report. Arlington, VA: Development Associates. 



As with the NEAEP, the 1980 national evaluation of adult literacy 
education was conducted by Development Associates under contract from the 
U.S. Department of Education. Work commenced in 1978, most data were 
collected in 1979, and the final report (Young et al., 1980) was issued in 1980. 
The general purposes of the evaluation were: “(a) to provide an analytic 
description of the Grants to States program, with particular emphasis on program 
participants and (b) to identify a set of impact measures that could be studied in a 
longitudinal design” (Young et al., 1980, p. 1). Because the evaluation was 
primarily descriptive, the collection of comprehensive, generalizable outcome and 
impact data on participants was not a major objective of the study. 

State-level data for the evaluation were gathered through a mailed survey 
to state directors of adult education in the 50 states. The response rate was 100 
percent. Local program data were collected from a national probability sample 
(n=420) of local programs in 47 states. Four hundred and four local directors 
responded, for a response rate of 96 percent. 

The outcome and impact data for the study were collected from a 
separately drawn sample of 110 local programs stratified according to type of 
sponsoring agency and size of program. At each local program, data were 
collected from directors, a randomly selected sample of teachers, and present and 
former participants. The intent of the evaluation was to collect data from learners 
who had completed at least one adult education course or who had terminated the 
program. Thus subjects were selected who had completed a course the semester 
prior to learner data collection. 

Each randomly selected teacher from among the 110 selected programs 
was asked to list the students he/she had taught during the spring of 1979. In 
order to obtain a sufficient number of learners per site, a minimum of 25 
participants were selected at each site. If the program selected had five or more 
teachers, five learners were selected from each teacher’s class list at random. Two 
kinds of data were obtained about learners. Descriptive data regarding learner 
characteristics and participation were collected from program records during site 
visits to the 110 programs. The second kind of data, upon which the outcome 
findings rely, was collected from selected learners through interviews that were 
generally conducted over the telephone. Although at least three attempts were 
made to reach each sampled learner, the final response rate was only 38 percent 
(1,177 of the 3,061 sampled cases). Fewer than half (43 percent) of the valid 
cases were still active in the program at the time of data collection. Because of 
the relatively low response rate, and because it was determined that those 
interviewed differed from those who could not be contacted, no attempt was made 
to generalize the findings to adult literacy education in general in the United 
States. 

Findings 

Outcome variables measured included participants’ reports of personal 
goal attainment, self-reported gains in self-concept, basic skill acquisition, getting 
along better with one’s family, getting a better job, and further education. 

In the interviews, learners were asked if they had reached, or were in the 
process of reaching, the goals they had set for themselves when entering the 
program. Forty-two percent reported that they had successfully reached their 
goals (ABE=44 percent, ASE=49 percent, ESL=24 percent). Thirty-eight percent 
reported that their goals had been partially obtained (ABE=36 percent, ASE=28 
percent, ESL=56 percent), and 17 percent claimed that their goals had not been 
met (ABE=16 percent, ASE=21 percent, ESL=14 percent). Secondary analysis 
showed that goal attainment was positively associated with several program 
variables, including having spent at least one year in the program, being a 
participant at the time of data collection, and having attended classes more 
frequently than the norm. With respect to demographic variables, females were 
more likely to report both successful and unsuccessful goal attainment. Perceived 
goal attainment increased with age. 

Regarding specific outcomes gained, 84 percent of the respondents 
reported that participation had improved their self-concepts (10 percent reported 
no improvement), 75 percent reported that they had improved in reading (22 
percent reported no improvement), 69 percent reported that they had improved in 
mathematics (27 percent reported no improvement), and 66 percent perceived that 
they had improved in writing (29 percent reported no improvement). Fifty-one 
percent believed that their family relations had improved because of participation 
(33 percent reported they had not), 25 percent perceived that their life skills had improved 
(62 percent reported they had not), and 18 percent reported that participation had 
helped them get a job (69 percent reported it had not). 

With respect to future educational plans, 57 percent of those interviewed 
said that they planned to enroll or had already enrolled in adult education courses 
and 24 percent reported that they were uncertain of their future educational plans. 
Eighteen percent reported that they planned no future education. Fifty-eight 
percent of the respondents said that they planned to enroll in schooling other than 
adult education, and 23 percent said they were uncertain about such plans. 

Strengths 

• The 1980 national evaluation is comprehensive, particularly with respect to 
descriptive information. 

• The report is clearly written, and methods and procedures are adequately 
documented. 

Limitations 

• There is a paucity of outcome data. 

• Because of a low response rate to the participant interviews, and because 
respondents were shown to have differed substantially from non-respondents 
in important respects, the outcome data from the 1980 national evaluation 
cannot be generalized to adult literacy education in general with any degree of 
confidence. Because of this limitation, other limitations such as lack of a 
control group and a reliance on self-report data are rendered moot. 

3. A Longitudinal Evaluation of the Adult Basic Education Program, 1973 



Kent, William P. (1973). A longitudinal evaluation of the Adult Basic Education 
Program. Falls Church, VA: System Development Corporation. 



The 1973 evaluation of the federal adult literacy program funded under the 
Adult Education Act commenced during the middle of 1971 and terminated more 
than two years later. As with the 1980 national evaluation and the NEAEP, the 
1973 national evaluation collected a great deal of descriptive information 
regarding the federal program. However, unlike the other national evaluations, 
the 1973 national evaluation focused primarily on learner outcomes. Another major 
difference between the 1973 evaluation and the others lies in the groups it 
excluded. Because the Adult Education Act at that time restricted instruction to 
adults at the pre-secondary level, the study was limited to learners with eight years 
or less of schooling. For logistical reasons, the study excluded ESL programs, 
migrant programs, and Native American programs, and, because at that time the 
priority population for adult literacy education was defined as learners between 
the ages of 16 and 44, persons under 16 and over 44 were also excluded. 
Although it had originally been planned to establish control groups for the study, 
these plans were abandoned when they proved not to be feasible. 

Because it was impossible to acquire an accurate count of the number of 
students meeting the study’s criteria, it was decided to use states as the primary 
unit in sampling. A two-way stratification protocol was adopted, based on 
grouping states by geographic region and the proportion of black students 
enrolled. The 50 states were classified into an 8x8 matrix and 16 cells were 
drawn from the total “with each cell’s probability of being drawn proportional to 
the number of students assigned to the cell. . . . Finally one state from each of the 
selected cells was drawn to enter the sample, the probability of being drawn 
proportional to the number of students” (Kent, 1973, pp. 3-8). From each of the 
selected 16 states, programs were selected using a proportionate-to-size 
methodology, and from each program, classes were randomly selected so as to 
produce a sample of approximately 25 students per program. The final sample 
constituted 92 programs, 206 classes, and 2,318 learners. 

The 1973 evaluation collected data from a number of sources, including 
state directors, local program directors, and learners. Most of the outcome data 
were collected from a learning gain test and a survey. The learning gain test was 
first administered in January 1972 and again in May 1972. The survey was 
administered in February and March 1972 and then again 12 and 18 months later. 

Tested Learning Gain 

After considering the learning gain tests available for adult literacy 
education in 1972, the 1973 evaluation selected two tests from level M of the Tests 
of Adult Basic Education (TABE), one measuring reading comprehension, the 
other measuring arithmetic fundamentals. As the final report states: 

The TABE had three levels (“E,” Easy; “M,” Medium; and “D,” Difficult). 
Level E is suitable for grades 1 through 4; M for grades 2 through 9 and D 
for grades 3 through 12. Since M covers all grades of interest to this study 
except 1.0-1.9, it is nearly satisfactory all by itself. (Kent, 1973, pp. 3-13) 

The validity and reliability of the TABE components used are not reported. After 
developing directions and field testing, tests and instructions were distributed to 
local program directors, and teachers were asked to administer the tests. Only the 
tests from the learners selected in the sample were used. Of the 1,108 initial tests 
that were obtained, matching pre- and posttests were obtained for 441 subjects. It 
is important to note that, strictly speaking, the tests administered were not pre- 
and posttests since at initial testing learners had already received instruction to 
varying degrees. 

Surveys 

The collection of student interview data was subcontracted to a market 
research firm. Initial student interviews coincided with ABE classes; students 
were excused from class for 20 minutes to complete the interviews. If necessary, 
interviewers returned to class several times to interview learners who had been 
selected as part of the sample. For those learners who could not be reached at 
class times, a sub-sample of half the absentees was established, and those selected 
were interviewed in their homes or other places. At least two attempts were made 
to interview members of the absentee sub-sample. First follow-up interviews 
were conducted a year after the initial interviews. Seventy-four percent of the 
initially interviewed learners were interviewed at first follow-up. Second follow-up 
interviews were conducted 18 months after the initial interviews, generally by 
phone. The response rate was 79 percent. Of the 1,448 students who were 
initially interviewed, 1,065 were reached for first follow-up and 844 were reached 
for the second follow-up. It is not known whether the 844 learners who were 
interviewed after 18 months differed substantially in important ways from the 604 
learners who were originally interviewed but could not be reached 18 months 
later. 
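
The follow-up rates quoted above can be checked directly from the interview counts. The short sketch below is illustrative only: the variable and function names are ours rather than the evaluation's, and the counts are those cited in the text.

```python
# Interview counts for the 1973 evaluation (Kent, 1973), as cited above.
initial = 1448           # learners initially interviewed
first_follow_up = 1065   # reached at the 12-month follow-up
second_follow_up = 844   # reached at the 18-month follow-up


def rate(reached: int, base: int) -> float:
    """Return reached/base expressed as a percentage."""
    return 100.0 * reached / base


# Rates relative to the preceding wave (the rates quoted in the text).
first_rate = rate(first_follow_up, initial)             # about 74 percent
second_rate = rate(second_follow_up, first_follow_up)   # about 79 percent

# Retention relative to the original interview sample, and learners lost.
cumulative_rate = rate(second_follow_up, initial)       # about 58 percent
lost = initial - second_follow_up                       # 604 learners

print(f"first follow-up: {first_rate:.0f}%")
print(f"second follow-up (of those reached at first): {second_rate:.0f}%")
print(f"second follow-up (of all initially interviewed): {cumulative_rate:.0f}%")
print(f"not reached at 18 months: {lost}")
```

Note that the 79 percent figure is computed on those reached at first follow-up; measured against all learners initially interviewed, retention at 18 months is roughly 58 percent.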

Findings 

Learning Gain 

When initially tested with components of the TABE, learners scored at 
grade level 5.4 on reading achievement and 6.4 in mathematics (raw scores were 
not reported). After the second administration of the test approximately four 
months later, in which a different test form was used, 26 percent of the students 
had gained one grade or more in reading, 41 percent had some gain, but less than 
one grade, and 33 percent had zero or negative gain. In mathematics, 19 percent gained 
one or more grades, 46 percent gained some, but less than one grade, and 35 percent 
showed zero or negative gain. The proportions of those who gained and those 
who did not may have been affected by the differing hours of instruction learners 
had amassed between the first and second test administrations. While almost a 
fifth of the learners had 39 hours or less of instruction between the first and 
second testing, another fifth had 80 or more hours of instruction. Average gains 
for reading were .5 grades after 98 hours and .4 grades after 66 hours. For 
mathematics the comparable figures were .3 grades both after 98 hours and after 
66 hours. 

In the 1973 study, learners with the lowest initial scores tended to show 
the greatest gains. In reading, those whose initial scores were below the fifth 
grade showed average gains of .8 grade levels, those whose initial scores were at 
the fifth or sixth grade demonstrated gains of .3 grade levels, and for those who 
initially scored at the seventh grade or higher the average reading gain was 0 
grade levels. For mathematics the pattern was similar. This is an apparent 
contradiction with the NEAEP which, using multiple regression, found that ABE 
learners’ pretest scores accounted for 61 percent of the variance in posttest scores. 
There are many factors that may explain this contradiction. One explanation is 
that, while the NEAEP carefully screened its cases to eliminate those that were 
likely to have been tainted by floor and ceiling effects, if the 1973 evaluation did 
so, it is not reported. Thus the findings of the 1973 evaluation may have been 
unintentionally biased by both floor and ceiling effects. The ceiling effect 
hypothesis is supported by the evaluation’s finding that, despite the intended 
exclusion of adult secondary-level learners, 54 percent of the sampled clients had 
completed nine or more grades of school. As with the NEAEP, the 1973 
evaluation showed no linear relationship between hours of attendance and tested 
learning gains. 

Females gained more than men, there was no clear relationship between 
learning gains and race and age, and there was no consistent pattern between 
learning gains and previous school experience. 

Further Education 

When they were initially interviewed, 60 percent of the learners reported 
that they might attend college at some time and 70 percent reported that they 
might attend vocational school at some time. However, over time, the 
percentages decreased substantially for college attendance. At follow-up, only 37 
percent said that they might enroll in college and 65 percent said they might 
attend vocational school. 

Helping School-age Children 

When they were initially interviewed, 55 percent of the learners reported 
that they had helped children with schoolwork; at follow-up 58 percent said they 
had helped children with schoolwork. 

Employment and Earnings 

Assessment of the impact of adult literacy education on employment and 
earnings was a major objective of the 1973 national evaluation. When initially 
interviewed, 55 percent of the learners reported that they were working, 26 
percent were receiving public assistance, and 58 percent reported that they had 
some earnings. A year later, 63 percent were working, 24 percent were on 
welfare, and 66 percent had some earnings. Eighteen months later, 65 percent 
were working, 22 percent were on welfare, and 70 percent had some earnings. 
Overall, there was a 10 percentage point increase in employment over 18 months 
and a 12 percentage point increase in the incidence of some earnings. For those 
who were employed at the time of initial interviews, over 18 months, mean monthly 
earnings increased from $336 to $407, mean hourly earnings increased from $2.00 
to $2.36, and mean hours worked per week increased from 37.1 to 39.1. Thus over 
18 months, mean monthly earnings increased 21 percent. Extrapolating to the entire adult 
basic education population, the 1973 evaluation reported that the increases would 
have amounted to $46 million per year (pp. 2-24). 
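
The figures in this paragraph mix two kinds of change: differences between percentages (employment, incidence of earnings) and relative changes in means (earnings, hours worked). The minimal sketch below recomputes both from the numbers cited above. The function names are illustrative assumptions and do not come from the 1973 report.

```python
def point_change(before_pct: float, after_pct: float) -> float:
    """Difference between two percentages, in percentage points."""
    return after_pct - before_pct


def relative_change(before: float, after: float) -> float:
    """Change expressed as a percentage of the starting value."""
    return 100.0 * (after - before) / before


# Shares of learners at the initial interview and 18 months later (percent).
employment_points = point_change(55, 65)        # 10 percentage points
earnings_points = point_change(58, 70)          # 12 percentage points
employment_relative = relative_change(55, 65)   # about 18 percent

# Means for learners who were employed at the initial interview.
monthly = relative_change(336, 407)    # about 21 percent
hourly = relative_change(2.00, 2.36)   # about 18 percent
hours = relative_change(37.1, 39.1)    # about 5 percent

print(f"employment: +{employment_points} points ({employment_relative:.0f}% relative)")
print(f"some earnings: +{earnings_points} points")
print(f"monthly earnings: +{monthly:.0f}%")
print(f"hourly earnings: +{hourly:.0f}%")
print(f"hours per week: +{hours:.1f}%")
```

Keeping the two measures apart matters when comparing studies: a rise from 55 to 65 percent employed is 10 percentage points, but about 18 percent relative to the starting rate.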

The gains in earnings reported by the 1973 evaluation are very large. 
Acknowledging that inflation and wage fluctuations may have accounted for 
perhaps 5 to 6 percent of the increase, the 1973 evaluation reported that “in 
analyzing the gains indicated by the foregoing figures, several considerations 
suggest that the gains are real rather than a product of selective interviewing, 
increases in hours of work, or inflation” (Kent, 1973, pp. 2-23). Unfortunately, 
the only way to determine with confidence whether these gains were real would 
have been to compare the gains of participants with a control group whose 
members differed from participants only with respect to participation in adult 
literacy education. Although such a control group was planned for the evaluation, 
it was never implemented. 

Learners who had received pay increases were asked how much they 
thought ABE had helped in getting the pay increase. Fifteen percent responded 
very much, 20 percent responded some, 12 percent responded a little, and 52 
percent responded not at all. Thus even if the large increases in pay reported by 
the 1973 study were real, almost two thirds of the respondents perceived that 
participation in ABE played either no role or a minor role in obtaining the 
increases. 

With the exception of the tested learning gain data, most of the outcome 
data is based on learners’ self-reports. This interjects a source of bias that 
becomes apparent in the data reported on attendance. Learners were asked, 
“During which months did you attend the adult basic education at least one time 
during the month?” When these data were compared to teachers’ reports, there 
were wide discrepancies between students’ and teachers’ reports and these 
discrepancies tended to increase over time. In November 1971, 81 percent of the 
students reported they had attended at least once, while 85 percent of the teachers 
reported that the student in question had attended at least once. In June of 1972, 
however, although 38 percent of the students reported attendance, only 17 percent 
of the teachers reported attendance. 

Strengths 

• As with the NEAEP and the 1980 national evaluation, the 1973 national 
evaluation was comprehensive in scope. 

• The report is clearly written and methods and procedures are adequately 
explained. 

• The outcome data is longitudinal. 

Limitations 

• Because of age and grade level exclusions in the 1973 evaluation, comparison 
with other national evaluations is problematic. 

• Because the tests used for tested learning gain were extrapolated from the 
TABE, and because a single level of the TABE was used for all respondents, 
the validity and reliability of the test data are issues. 

• As with the NEAEP there was some program attrition from the study and 
there was substantial respondent attrition. Although the study attempted to 
compensate, the extent to which attrition compromised the proportionate-to- 
size sampling plan is an issue. If the sampling plan was seriously 
compromised, the generalizability of the findings is questionable. 

• Respondent attrition leads one to question the internal and external validity of 
the outcome findings. 

• Posttests were administered approximately four months after pretests. 
Whether this is a reasonable time over which to expect meaningful learning 
gains is an issue. 

• With the exception of the test data, outcome findings are based on learner self- 
report. 

• The study lacked a control or comparison group. 

4. The 1968 Evaluation 



Greenleigh Associates, Inc. (1968). Participants in the field test of four adult 
basic education systems: A follow-up study. New York: Author. 



The earliest evaluation of adult basic education that could be considered 
national in scope was conducted by Greenleigh Associates prior to the advent of 
the Adult Education Act. At that time, the federal adult literacy education 
program was administered by the United States Office of Economic Opportunity. 
The evaluation, which was conducted between June 1966 and January 1968, is the 
follow-up to a field test of adult basic education programming conducted from 
March 1965 to May 1966 in three states: New York (three counties), New Jersey 
(two counties), and California (one county). Participants in the field test, who had 
been selected to assure representation on age, gender, and ethnic background, had 
all scored below the fifth-grade level on a standardized test. Initially 540 
participants were randomly placed in 36 classes. 

With respect to the outcomes portion of the evaluation, the objectives were 
to determine the extent to which reading skills learned during the field test had 
been retained and to assess the effects of the educational experience provided in 
the field test on income, housing, family relations, health, and motivation. 

In order to permit comparisons, participants in the follow-up study were 
assigned to one of three groups based on the following characteristics. 
Participants (Group I) were those who had completed 17 weeks of adult basic 
education in conjunction with the field test. Non-participants (Group II) were 
those who qualified to participate by virtue of the fact they had scored below 
grade level 5 on the reading test but had dropped out during the first two days or 
had declined to attend classes. It should be noted that Group II did not constitute 
a true control group since there was no random assignment to treatment and 
control groups involved. In fact, Group II members obviously differed from 
Group I participants with respect to their motivation and/or ability to participate in 
adult basic education. As the authors suggest, it was possible that while “Group I 
to a significant extent were continued in adult basic education following the field 
test . . . Group II were pressured to enter basic education or seek employment with 
the threat of dismissal from welfare for noncompliance. Apparently, Group II 
persons sought and obtained employment rather than engage in education” 
(Greenleigh Associates, 1968, p. 30). The third group (Group III), designated as 
“overqualified,” were those who did not participate in the field test since they had 
scored above 5.9 on the standardized reading test. 

Two methods were used to collect data — a series of two interviews 
conducted 6 and 12 months after the field tests and a standardized reading test, the 
Gray’s Oral Paragraphs Test. The test was administered twice following the 
interviews. 

Five to six interviewers, most of whom were trained social workers, were 
recruited for each state. All received one week’s training prior to the interviews 
and administration of the test. For the most part, interviews and tests were 
administered in subjects’ homes. Although it took an average five to six contacts 
to produce a single interview, during the first set of interviews 1,641 cases were 
obtained from a sample of 2,003. Three hundred and sixty-two of those who were 
scheduled for first interviews ultimately could not be reached. During the second 
round of interviews 1,425 cases were obtained and 216 of those scheduled for 
interviews could not be reached. It is likely that the evaluation was affected by 
attrition bias. Although the universe was 15 percent white, 14 percent male, and 
66 percent over age 30, the first interview participants were 20 percent white, 18 
percent male, and 73 percent over age 30. Of those who were interviewed in both 
the first and second interviews, 18 percent were male and 74 percent were over 
age 30; of those who were administered the first interview, but not the second, 23 
percent were male and 66 percent were over age 30. Qualitative as well as 
quantitative data were collected during the interviews, and the qualitative data are 
used to elaborate and refine the quantitative findings reported. 

Upon the recommendation of the New York State Education Department, 
the Gray’s Oral Paragraphs Test was used as the reading test for the study. 
Because this test was designed for, and normed on, children, the validity of the 
test for an adult population is an important issue. More appropriate, adult- 
oriented reading tests were simply not available at the time of the study. 

In an attempt to verify the data gathered in interviews, caseworkers were 
also interviewed. However, this component of the evaluation was fraught with 
problems. As the authors note: 

In many cases case workers were completely unfamiliar with their clients, 
had never seen them, and were obviously unacquainted with the field 
record. This occurred because turnover among caseworkers was unusually 
high, and reorganization often caused reshuffling of case loads. 
(Greenleigh Associates, 1968, p. 19) 

Findings 

Economic Impact 

Most participants in the study experienced no change in their welfare 
status between the first and second interviews (Group I=86 percent, Group II=77 
percent, Group III=86 percent). A small number (Group I=2 percent, Group II=3 
percent, Group III=2 percent) of those who were not initially on welfare became 
welfare participants. A majority of those in each group who were not on welfare 
during the first interview were still not on welfare during the second interview. 
Only a small group who were on welfare during the first interview had been 
removed from welfare by the second interview (Group I=5 percent, Group II=5 
percent, Group III=5 percent). 

Qualitative data collected during the interviews suggested that: 

According to the interviewers in each state during both interview periods, 
the overwhelming majority of the study population asserted their desire to 
gain their independence from welfare. However, they could not envision 
how to accomplish this goal. The move into gainful employment required 
the solution of critical difficulties, in most cases related to problems of 
adequate child care. (Greenleigh Associates, 1968, p. 27) 

The employment rate for participants (Group I) at the time of the first 
interviews was 17 percent while the employment rate for this group at the second 
interview was 20 percent. Nine percent of the participants in adult basic 
education who were unemployed during the first interview gained employment by 
the second interview (Group II=4 percent, Group III=9 percent). Ten percent of 
participants were employed during both periods (Group II=17 percent, Group 
III=11 percent), and 75 percent were unemployed during both periods. Six 
percent of those participants who were employed during the first interview 
became unemployed by the second (Group II=9 percent, Group III=5 percent). 

The authors caution that, because Group I members were enrolled in basic 
education, they were not necessarily available for employment. 

Social Impact 

The 1968 evaluation showed substantial impact on social participation 
variables for all groups. With respect to reported participation in community 
organizations, at the time of the first interview 12 percent of participants reported 
that they had participated in community organizations (Group II=7 percent, Group 
III=17 percent), while at the time of the second interview, 31 percent of the participants 
reported participation in community organizations (Group II=22 percent, Group 
III=31 percent) for a net gain of 19 percentage points for participants. 
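
Because all three groups reported higher participation at the second interview, the participants' gain is easier to interpret when set beside the gains of the other two groups. The sketch below does this with the percentages cited above. It is a descriptive comparison only: Groups II and III were not true control groups, and the dictionary layout and names are assumptions made for illustration.

```python
# Percent reporting participation in community organizations at the first
# and second interviews (figures as cited from Greenleigh Associates, 1968).
reported = {
    "I": (12, 31),    # participants
    "II": (7, 22),    # non-participants
    "III": (17, 31),  # overqualified
}

# Gain for each group, in percentage points.
gains = {group: second - first for group, (first, second) in reported.items()}

# How much larger the participants' gain was than each comparison group's gain.
excess = {group: gains["I"] - gains[group] for group in ("II", "III")}

for group, gain in gains.items():
    print(f"Group {group}: +{gain} percentage points")
print(f"Participants' gain beyond Group II: {excess['II']} points")
print(f"Participants' gain beyond Group III: {excess['III']} points")
```

On these figures participants gained 19 points against 15 and 14 for the comparison groups, so most of the increase was shared by those who did not take part in adult basic education.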

Education 

At the time of the first interview, 20 percent of those initially designated 
as participants in adult literacy education (Group I) were still attending, 44 
percent had completed the program, and 16 percent had dropped out. Of those 
who participated in a first interview, 67 percent were still attending a program, 9 
percent had completed, and 24 percent had dropped out by the time of the second 
interview. Participants were asked in what areas they had gained as a result of 
participation, and, of the 10 options listed, the greatest gains were reported in 
reading skills (29 percent first interview, 23 percent second interview), arithmetic 
skills (21 percent first interview, 19 percent second interview), and 
self-confidence (13 percent first interview, 11 percent second interview). 

Initial scores obtained at the beginning of the field test on Gray’s Oral 
Reading Paragraphs Test were 2.9 for participants (Group I), 2.9 for non- 
participants (Group II), and 8.0 for the overqualified group (Group III). During 
the first interviews, approximately 12 months later, scores were 4.0 for Group I, 
4.0 for Group II, and 8.4 for Group III. Although the test scores may well have 
been biased by attrition effects and the validity and reliability of the test are 
significant issues, it appears that those who participated in adult literacy education 
after the field test did not score higher on the test than those who did not 
participate. At the end of the second interviews, which took place 6 months after 
the first, participants scored 4.0, non-participants scored 4.0, and Group III scored 
8.2. Thus again there was no appreciable difference between participants and 
non-participants. It was also found that the level of teacher preparation, defined as 
high school graduate, college graduate, or certified teacher, had no effect on 
learning gain. Likewise, the type of reading system used (American Incentive to 
Read, Science Research Associates, Mott Basic Language Skills Program, 
Systems for Success) had no effect on learning gain. 

Strengths 

• The 1968 evaluation is comprehensive with respect to the variables included 
and the n is large. 

• The report is clearly written and is well documented with regard to methods 
and procedures. 

• The study is longitudinal. 

• Qualitative data enhance and enrich the report. 

• Comparison groups were included. 

Limitations 

• Because the adult literacy education program as constituted in 1968 differs 
markedly from the program as constituted today, findings cannot be 
generalized to the present. Because the states and counties studied were not 
representative of the universe, findings cannot be generalized to adult literacy 
in general in 1968. 

• The comparison groups are not true control groups. Because Group II, 
designated as non-participants, was drawn from those who elected not to 
participate in classes or dropped out, this group may be biased at least with 
respect to its motivation and/or ability to participate in adult literacy 
education. 

• The test used, the Gray’s Oral Reading Paragraphs Test, was designed for, and 
presumably normed on, children. As such its validity for adults is 
questionable. 



STATE STUDIES 

Although since the late 1960s many states have published input-output 
studies describing their adult literacy education programs, few states have 
conducted formal outcome assessments. Only nine state outcome studies were 
assessed as being sufficiently credible for inclusion in this report: California, New 
Jersey, Maryland, Ohio, Tennessee (two studies), Utah, Washington, and 
Wisconsin. Studies will be reported in chronological order, the most recent listed 
first. 



5. The Washington Workforce Training Study 



Washington State Training and Education Coordinating Board. (1997). Workforce 
training results net impact and benefit cost evaluation. [Meeting No. 52, March 
25, 1997]. Author. 



Because the final report of the Washington Workforce Training Study was 
not available at the time of this review, the above preliminary report was used for 
our analysis. The purpose of the Washington study was to determine whether the 
outcomes for participants in workforce training programs were similar to the 
outcomes for those who did not participate in training. Workforce training 
programs were operationally defined as post-secondary training at community 
colleges and technical colleges, adult basic skills education for participants who 
enrolled for work-related reasons, Job Training Partnership Act (JTPA) Title IIa 
programs for disadvantaged adults, JTPA Title IIc programs for youth, and 
secondary vocational-technical education. The analysis here will focus on the 
adult basic skills program. 



NCSALL Reports #6 



January 1999 



The study, which was conducted by the Battelle Memorial Institute, 
employed a matched comparison group analytic design. To examine both short 
and longer term effects, two groups of program completer/leavers were studied — 
those who exited during 1991-1992 and those who terminated in 1993-1994. 
Sample selection began with the acquisition of lists of all program 
completer/leavers for the years studied. As the report acknowledges, this was 
problematic for adult basic education since it was impossible to separate 
completers from dropouts. Hence, completer/leaver for adult literacy was defined 
as anyone who stopped attending and did not return for at least one year. 

The selection of a comparison group began with a list of employment 
service registrants who had not participated in any of the workforce education 
programs studied. Through a logistic regression procedure, comparison group 
members were selected who were “equivalent” to treatment group members with 
respect to age, race, ethnicity, gender, education, region, employment history, 
earnings, and receipt of welfare and unemployment insurance. Although the 
study’s authors reported that this procedure was generally successful in matching 
treatment group participants with comparison group members, the match for the 
1993-1994 cohort of adult basic skills participants was less successful due to a 
relatively small pool of potential comparison group members. 
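
The matching step described above can be illustrated with a minimal sketch. It assumes that each person already has a score estimated by a logistic regression on the matching variables, and it pairs each participant with the closest unused comparison candidate. The report as summarized here does not say which matching algorithm was used, so the greedy nearest-neighbor rule, the caliper of 0.05, the function name, and all scores below are hypothetical.

```python
def greedy_match(treated, pool, caliper=0.05):
    """Pair each treated unit with the closest unused pool unit by score.

    `treated` and `pool` map an identifier to a score (assumed to come
    from a previously fitted logistic regression). A pair is kept only
    if the two scores differ by no more than `caliper`.
    """
    available = dict(pool)  # copy so the caller's pool is not modified
    pairs = {}
    # Process treated units in score order so results are reproducible.
    for unit_id, score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break  # the comparison pool is exhausted
        best = min(available, key=lambda cid: abs(available[cid] - score))
        if abs(available[best] - score) <= caliper:
            pairs[unit_id] = best
            del available[best]  # match without replacement
    return pairs


# Hypothetical scores: three participants, three comparison candidates.
participants = {"T1": 0.30, "T2": 0.62, "T3": 0.90}
candidates = {"C1": 0.28, "C2": 0.33, "C3": 0.60}
matches = greedy_match(participants, candidates)
print(matches)
```

With a small candidate pool, some participants find no candidate inside the caliper, which is one way a limited pool can weaken a match of the kind reported for the 1993-1994 cohort.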

Data for the study were obtained from records compiled by the workforce 
programs in which participants enrolled, the state employment service, and 
welfare and unemployment insurance offices. Data were collected in 1995, which 
was seven to nine months after termination from the adult literacy education 
program for the 1993-1994 cohort and three years after termination for 
the 1991-1992 cohort. 

It is important to note that the adult literacy education subjects of the study 
had all been enrolled in programs sponsored by community and technical colleges 
and had all enrolled for reasons related to work. This included only about one 
third of the total of adult literacy education students (including ESL) served by 
Washington’s community and technical colleges. In addition, the study was 
limited to those who had enrolled only in basic skills and, therefore, it excluded 
learners who had also enrolled in vocational education. Finally, as the report 
mentions, construction of an adequate comparison group for adult basic skills was 
difficult because there was no information about the basic skills levels or English 
language proficiency of comparison group members. 

Findings 

Seven to nine months after exiting from adult literacy education, 52.8 
percent of the comparison group were employed and 44.9 percent of the 
participant group were employed, the difference being 7.9 percent in favor of the 
comparison group. Three years after leaving adult literacy education, 49.4 percent 
of the comparison group were employed and 45.7 percent of the participant group 
were employed for a difference of 3.7 percent, again in favor of the comparison 
group. Participation in adult literacy education was, therefore, found to have a 
negative effect on employment. 

Seven to nine months after terminating the adult literacy education 
program the mean hourly wage for participant group members who were 
employed was $8.22, while for comparison group members hourly earnings were 
$7.90, the difference being $.33. Three years after termination, participant group 
members were earning $9.00 and comparison group members were earning $9.05. 
The difference of $.05 was not statistically significant. Adult literacy education 
had a small positive short-term effect and no long-term effect on hourly earnings. 

In the short-term, former participants worked an average of 355 hours per 
quarter and comparison group members worked 326 hours for a difference of 29 
hours in favor of the participant group. In the long-term, participant group 
members worked 393 hours and comparison group members worked 363 hours 
for a difference of 30 hours in favor of the participant group. Participation in 
adult literacy education had a positive effect on hours worked. 

With respect to mean quarterly earnings, seven to nine months after 
leaving the adult literacy program, participant group members earned $2,994 and 
comparison group members earned $2,635 for a difference in favor of the 
participant group of $359. Three years after termination from the program 
participant group members earned $3,653 and comparison group members earned 
$3,361 for a difference of $292 again in favor of the participant group. 
Participation in adult literacy education had a positive effect on quarterly 
earnings. 

In the short-term, 9.3 percent of the participant group received AFDC, 
23.2 percent received food stamps, and 23.9 percent received medical benefits. 
The figures for comparison group members were 6.3 percent for AFDC, 19.2 
percent for food stamps, and 18.5 percent for medical benefits. The differences 
between the groups were 3 percent for AFDC, 4 percent for food stamps and 5.4 
percent for medical benefits. In all cases, the incidence of public assistance was 
higher for participant group members than for comparison group members. Three 
years after termination, 6.8 percent of the participant group members received 
AFDC, 18 percent received food stamps, and 18.9 percent received medical 
benefits. For comparison group members, the percentage for AFDC was 5.9, the 
percentage for food stamps was 18.8, and the percentage for medical benefits was 
17.8. Although the small difference of .9 percent was statistically significant for 
AFDC, the differences for food stamps and medical benefits were not significant. 
Participation in ABE had a negative effect on receipt of public assistance in both 
the short and long term. 
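
As a check on the arithmetic, the participant-minus-comparison differences can be recomputed from the figures quoted above. The sketch below uses only numbers that appear in this section; the dictionary labels and the helper name are illustrative. Note that the rounded short-term hourly wages ($8.22 and $7.90) give a difference of $.32 rather than the $.33 reported, presumably because the reported difference was computed from unrounded means.

```python
from decimal import Decimal

# (participant, comparison) values exactly as quoted in the text above.
REPORTED = {
    "employed, short term (%)": ("44.9", "52.8"),
    "employed, long term (%)": ("45.7", "49.4"),
    "hourly wage, short term ($)": ("8.22", "7.90"),
    "hourly wage, long term ($)": ("9.00", "9.05"),
    "hours per quarter, short term": ("355", "326"),
    "hours per quarter, long term": ("393", "363"),
    "quarterly earnings, short term ($)": ("2994", "2635"),
    "quarterly earnings, long term ($)": ("3653", "3361"),
    "AFDC receipt, short term (%)": ("9.3", "6.3"),
    "AFDC receipt, long term (%)": ("6.8", "5.9"),
}


def gap(participant, comparison):
    """Participant value minus comparison value, computed exactly."""
    return Decimal(participant) - Decimal(comparison)


gaps = {label: gap(p, c) for label, (p, c) in REPORTED.items()}

# A positive value means the participant group's figure was higher.
for label, value in gaps.items():
    print(f"{label}: {value:+}")
```

Decimal arithmetic is used here so that the differences are exact and are not blurred by binary floating-point rounding.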

With respect to benefits and costs, the report states that “the ratio of 
participant benefits to program costs is therefore, without considering impacts on 
social-welfare expenditures, $416 to $1,261” (p. 11). 

Strengths 

• The report is sufficiently clear and detailed with respect to methods, 
procedures, and findings. 

• Direct measures, rather than learner self-report, were used. 

• The study includes a relevant comparison group. 

• The sample size is large and adequate for the analyses performed. 

• The study measures both long- and short-term effects. 

Limitations 

• Although the use of a comparison group enhances the study, the comparison 
group is not a true control group. As the report acknowledges, the matching 
procedures for adult literacy education were imperfect because the adult 
literacy rate for the comparison group was not known. As a result, it is not 
known for certain whether the differences between the participant and 
comparison groups can be attributed to the effects of adult literacy education 
or should be attributed to the non-equivalence of the participant and 
comparison groups. 

• Data were collected from program records. The report acknowledges some 
problems with employment and earnings data gathered from state 
unemployment insurance records and data collected from welfare records. 

• The participant group comprised leaver/completers, and the study does 
not distinguish between the two. Thus the proportion of successful completers 
of adult literacy education to learners who dropped out before their goals had 
been achieved is not known. 

6. The Tennessee Longitudinal Study 



Merrifield, J., Smith, M., Rea, K., & Shriver, T. (1994). Longitudinal study of 
adult literacy participants in Tennessee: Year two report. Knoxville, TN: Center 
for Literacy Studies, University of Tennessee. 



Merrifield, J., Smith, M., Rea, K., & Shriver, T. (1993). Longitudinal study of 
adult literacy participants in Tennessee: Year one report. Knoxville, TN: Center 
for Literacy Studies, University of Tennessee. 

The Tennessee Longitudinal Study was one of the most ambitious state 
studies, and, had its ambitions been realized, it might have been the best state 
study yet conducted. The goals of the five-year effort were: 

1. To expand our understanding of how participation in literacy programs 
changes adults’ quality of life.... 2. To examine the influence of 
community and programmatic contexts of the individuals in the study, 
within which they change skills, perceptions and attitudes, and to explore 
the meaning for them of these changes, 3. To provide findings for policy 
makers and program developers to inform development of future adult 
basic skills programs. (Merrifield et al., 1993, p. 9). 

The study was longitudinal in design, and a qualitative component was planned to 
examine changes in the individual, community, and program contexts. 

The study focused on ABE level one participants, those who had scored 
below grade level 5.9 on the ABLE Test. The sampling plan, which used a paired 
comparison method, began with the selection of program sites. Six demographic 
variables were identified as being relevant to the study: percent non-white, percent 
of families living in poverty, percent population change, percent high school 
graduates, percent urban, and median years of school completed. Tennessee 
counties were then sorted into three demographic regions by rural/urban. This 
produced six sets of counties. Then the means for the six demographic variables 
were computed for each of the six sets of counties. Counties with the most 
variables that fell within a half standard deviation from the means on the six 
variables were selected as possible sites. Next, the number of level one ABE 
students for each potential county site was computed as a step toward developing 
a sample of about 240 learners. It was found, however, that only the larger urban 
counties had sufficient numbers of level one students to yield a sample of this 
size. As a result, some smaller counties were combined. Eight county research 
sites were thus identified. 
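
The site-screening rule described above (favoring counties whose values fall within half a standard deviation of their set's mean on the most variables) can be sketched as follows. This is only an illustration under stated assumptions: the variable keys, the county names and values, and the use of the population standard deviation are hypothetical, since the report as summarized here does not give those details.

```python
from statistics import mean, pstdev

# Hypothetical keys for the six demographic variables named in the text.
VARIABLES = [
    "pct_nonwhite", "pct_poverty", "pct_pop_change",
    "pct_hs_grad", "pct_urban", "median_school_years",
]


def count_typical_variables(county, county_set):
    """Count variables on which `county` lies within half a standard
    deviation of the mean for its set of counties."""
    count = 0
    for var in VARIABLES:
        values = [c[var] for c in county_set]
        center = mean(values)
        spread = pstdev(values)  # assumption: population standard deviation
        if abs(county[var] - center) <= 0.5 * spread:
            count += 1
    return count


def rank_candidate_sites(county_set):
    """Return (count, name) pairs, most typical counties first."""
    scored = [(count_typical_variables(c, county_set), c["name"])
              for c in county_set]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Made-up values for one set of counties; not data from the study.
example_set = [
    {"name": "County A", "pct_nonwhite": 5.0, "pct_poverty": 18.0,
     "pct_pop_change": 2.0, "pct_hs_grad": 55.0, "pct_urban": 20.0,
     "median_school_years": 10.5},
    {"name": "County B", "pct_nonwhite": 12.0, "pct_poverty": 15.0,
     "pct_pop_change": 4.0, "pct_hs_grad": 60.0, "pct_urban": 35.0,
     "median_school_years": 11.2},
    {"name": "County C", "pct_nonwhite": 30.0, "pct_poverty": 12.0,
     "pct_pop_change": 9.0, "pct_hs_grad": 68.0, "pct_urban": 80.0,
     "median_school_years": 12.1},
]
print(rank_candidate_sites(example_set))
```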

The data collected in the first year of the study were to serve as a baseline 
for longitudinal follow-up data collections in subsequent years. Data were 
collected through personal interviews conducted by interviewers trained by the 
project. One hundred and thirty-three interviews were obtained, well short of the 
240 interviews intended. Several reasons for the failure to meet interviewing 
goals were posited: 

Several of the sites reported increases in the number of ABE Level-2 and 
ASE new entrants, but reported few new ABE Level-1 entrants in that 
year. Since only those scoring at 5.9 or below in reading qualified for the 
study, not all of the ABE Level-1 students qualified for the study, our 
initial projections (based on ABE-1 totals) were probably optimistic. We 
also excluded some students who were in correctional facilities, nursing 
homes and other students aged 75 and over, and students who were 
mentally retarded. (Merrifield et al., 1993, p. 16) 

Although the 133 learners who were included in the baseline cohort were 
sufficient for meeting the first year’s basically descriptive project goals, it was 
recognized at the outset that, given expected attrition in the baseline cohort, 
sufficient numbers of subjects might not be available in future years to meet the 
goals of the longitudinal study. 

In the second year, longitudinal comparisons were to be made between the 
baseline cohort for year one and the same subjects a year later. However, as 
expected, the Tennessee study experienced significant attrition between years one 
and two, and was left with a sample size of 70 in the second year. Most of the 
attrition came from one large urban site, thus skewing the second year sample. 
Moreover, comparisons between the first and second year data showed that the 
composition of the second year sample had changed substantially as a result of 
attrition. While for the first year sample 30 percent of the subjects resided in East 
Tennessee, in the second year 44 percent resided in East Tennessee. Other 
variables for which there were differences between years included race (first year 
blacks=58 percent, second year blacks=47 percent) and employment (first year 
employed=33 percent, second year employed=46 percent). 

Unfortunately, the amount of attrition and the significant differences 
between the year one and year two data render meaningful inferences from the 
data problematic. To compound the problem, the sample size for the second year 
data was so small that many important analyses were impossible. Although it was 
intended that the follow-up data for the study would be collected one year after 
the baseline data, in actuality follow-up data were collected between 12 and 20 
months after the baseline data. Thus the time interval between baseline data 
collection and follow-up data collection was not uniform. 

The Tennessee Study used two instruments to collect data. The first was a 
117-item focused interview guide that included variables pertaining to 
socio-economic well-being, social well-being, personal well-being, and physical 
well-being. The survey was well-conceived, detailed, and comprehensive. Yet, 
because of its length, the instrument required about an hour to administer. To 
measure self-esteem, the Rosenberg Self-Esteem Scale was used. The Rosenberg 
is of acceptable validity and reliability and has been used in other studies of adult 
literacy education. It is one of the shorter self-esteem instruments. 

Findings 

The outcome findings, which pertain to changes in the first year cohort 
over a year’s time, are found in the second year report (Merrifield et al., 1994). 

Learners’ Assessments of Their Participation and Additional Education 

Ninety-one percent of the respondents reported that participation in adult 
literacy education had made a difference for them or helped them achieve their 
goals. Forty-nine percent reported that participation had made a difference in 
reading, writing, and math. Findings for “making a difference” in the other areas 
listed were: learning in general, 18 percent; everyday literacy skills, 13 percent; 
self-confidence, 10 percent; getting/improving a job, 8 percent; helping children, 
1 percent; being with people, 1 percent. Seventeen percent reported that since 
enrolling in ABE classes they had enrolled in other educational activities. 

Employment 

The overall employment gain for the group was 9 percent. Of those who 
were unemployed and looking for work when they enrolled in the ABE program, 
almost half had secured employment by the second year interview. However, 20 
percent of those who were employed in year one had lost their jobs by the second 
year. As the second year report states: 

The bad news is that all those who gained a job left the ABE program. 
That is bad news for a couple of reasons. First, for the most part they did 
not participate long enough to be likely to make great skill gains, given 
that they were reading at the fifth grade level or below. Second, the jobs 
they got were not very good jobs. Average wages were only $4.87 per 
hour, worse even than the wages of those employed at the baseline ($6.07 
per hour). (Merrifield et al., 1994, p.54) 

Self-Esteem 

Self-esteem was measured by the Rosenberg Self-Esteem Scale. Small but 
statistically significant changes were found in self-esteem between the baseline 
and follow-up data collections (baseline mean=3.63, follow-up mean=3.83, 
difference=.20). Differences were lower for those who were no longer active in 
the program (.16) than for those who were still active (.28). Differences were the 
least for those with minimal participation (-.13), greatest for those with moderate 
participation (.32), and in between for those with substantial participation (.27). 

Having become recently employed proved to be an important factor in 
explaining differences in self-esteem. The difference in Rosenberg scores for the 
recently employed was .31, while the difference for the continued employed was 
.22, for the unemployed looking for work .21, and for the unemployed not looking 
for work .15. Race was not an important factor in changes in self-esteem. 
Women’s self-esteem (difference=.31) increased substantially more than men’s 
(.07). 



In addition to receiving the Rosenberg instrument, participants were asked 
if their feelings about themselves had changed. Seventy-seven percent reported 
that they felt better about themselves, 20 percent said they felt the same, and 3 
percent reported that they felt worse. Thirty-nine percent reported that being able 
to read better caused the change and 29 percent reported that attending the ABE 
program had caused the change. 

It should be noted that there is a discrepancy between the changes 
recorded on the Rosenberg Scale and the changes in self-esteem reported by 
learners. While the changes recorded by the Rosenberg are quite small, the self- 
reported changes are quite large. 

Marriage, Family, Community Involvement, Everyday Literacy, and Health 

No significant changes were found in positive marital relations between 
baseline and follow-up. Changes were found with respect to activities with 
children. Although helping children with school work decreased 5 percent 
between baseline and follow-up (44 percent baseline, 39 percent follow-up), 
visiting a child’s teacher four or more times increased from 28 percent to 46 
percent. Checking on a child’s progress in school increased from 50 percent to 75 
percent. Attending school activities increased from 50 percent to 60 percent, and 
attending four or more school activities increased from 40 percent to 61 percent. 

Although voter registration increased 5 percent between baseline and 
follow-up, the percent that voted in the last presidential election decreased 8 
percent. Seventy-nine percent of those who reported that they had changed in the 
way they felt about their community reported that the change in feeling was 
positive. Of the eight questions related to community involvement, only one 
showed significant changes (“as a citizen I can bring about needed change in 
government”), and that change was negative (42 percent decrease between 
baseline and follow-up). 

Changes in attitudes toward literacy increased and, of the 84 percent who 
reported that they had seen changes in their everyday literacy usage, 51 percent 
noted increases in reading and writing. Use of the public library increased, as did 
incidence of visits to the local public health clinic. 

Strengths 

• In design, the study was sound. Because of the longitudinal design, long-term 
as well as short-term changes could have been ascertained had the study been 
conducted as planned. The qualitative component afforded the possibility of 
yielding new insights into the meanings learners ascribe to their participation in 
adult basic education. The sampling plan was adequate to produce findings 
with external validity. 

• The survey was detailed and comprehensive. Variables were sufficiently 
numerous and relevant to permit detailed and sophisticated analyses. 

• The reports are clear and well documented. 

Limitations 

• Because funding for the study was withdrawn before its completion, complete 
follow-up data from planned data analysis cohorts were not collected. More 
importantly, the qualitative component was never implemented. 

• Learner attrition from the study over time resulted in an unacceptably small 
sample size for the follow-up samples. This precluded detailed and 
sophisticated analyses. Learner attrition may also have resulted in biased data 
for the analysis of changes that occurred over time. 

• The time period between the baseline and follow-up data collections varied 
significantly in duration. 

Like the NEAEP, the Tennessee Longitudinal Study is an excellent case 
study of what can go wrong with well-intentioned and well-designed outcome 
studies in adult literacy education. 

7. The California Adult Learner Progress Evaluation (CALPEP) 



Solorzano, R. (1989a). An analysis of learner progress from the first reporting 
cycle of the CALPEP field test: A report to the California State Librarian. 
Pasadena, CA: Educational Testing Service. 



Solorzano, R. (1989b). An analysis of learner progress from the second reporting 
cycle of the CALPEP field test: A report to the California State Librarian. 
Pasadena, CA: Educational Testing Service. 



CALPEP was an outcome assessment of the California Literacy Campaign 
conducted by the Educational Testing Service. It differs from the previously 
reported outcome evaluations in two important respects. First, the California 
Literacy Campaign is a tutor-based adult literacy education program conducted 
under the auspices of the California State Library System. Hence all outcomes are 
the product of one-on-one tutoring. Secondly, the variables in the study relate 
almost exclusively to reading and writing behavior. They include reading habits, 
writing habits, reading levels relative to reading goals, writing levels relative to 
writing goals, overall reading level, overall writing level, learners’ perception of 
reading progress, effect on job status, and learners’ reasons for leaving the 
program. 

Data were collected with an intake form entitled “Where We Started.” 
Follow-up versions of the intake form were administered semi-annually and at 
exit from the program. Tutors and learners completed the forms together. Data 
were reported only for learners with matched data, that is, learners who completed 
both intake and follow-up forms. Data were collected from 53 of the 65 libraries 
participating in the California Literacy Campaign. This represented 733 learners 
from whom 354 matched sets of data were obtained. The amount of time that the 
354 learners in the study had participated in tutoring ranged from three to five 
months. 

The first CALPEP report (1989a) presents data from the first six months 
of the assessment. The second report (1989b) reports data from the first year of 
the assessment. Findings are drawn from the second report, which reports 
changes over the duration of a year. 

Findings 

Reading frequency was measured as the frequency that 14 types of reading 
material were read. Differences between baseline and follow-up were recorded as 
follows: less often than at baseline measurement, more often, or the same. For the 
majority of learners, reading frequency did not change, but for those whose 
frequency did change, reading frequency generally increased rather than 
decreased. More than 20 percent of the learners increased their frequency of 
reading books, mail/bills/letters, labels and instructions, TV guides, newspapers, 
and magazines. Changes in the difficulty learners experienced in reading 
materials were also reported for the 14 types of reading material. Again, the 
largest group demonstrated no change in difficulty in reading. However one 
quarter of the learners reported less difficulty reading mail/bills/letters, 
labels/instructions/work-related materials, and books. 

For changes in overall reading behavior, learners were asked to indicate 
how often they read materials in general outside the tutoring sessions. Response 
options ranged from “a few minutes” to over four hours per week. As the report 
states: 



In general, learners at the lower-frequency end increased their outside 
reading habits while those at the upper end did not demonstrate much 
change in their outside reading habits. In effect, a plateau seemed to be 
reached after a certain amount of time. (Solorzano, 1989b, p. 10) 

General reading frequency increased over time. As with reading behavior, the 
majority of learners reported no change in writing behavior over a year. 

CALPEP attempted to measure learners’ self-perceptions of their reading 
and writing growth by the changes in the responses to five items that were 
assumed to measure improvement. For reading growth the response options were: 
I can’t read; I can read, but only simple things; I can read, but I can’t understand; I 
can read, but not under pressure; I can read, and I like to read. For writing the 
items were: I can’t write; I can write, but just letters and words; I can write, but 
only simple things; I can write, but I can’t spell; and I can write, and I like to 
write. For reading, more learners perceived they had improved than perceived 
that they had stayed the same or regressed. For learners who initially said they 
could not read, at follow-up 56 percent responded with “I can read, but only 
simple things” and 20 percent responded with “I can read and I like to read.” 

For writing, of those who initially indicated they could not write, 36 percent 
improved to writing letters and words, and 20 percent improved to writing simple 
things. Seventy-three percent of the learners reported that participation in the 
California Literacy Campaign had helped them in their jobs. 

Strengths 

• The assessment is sufficiently simple in its instrumentation and administration 
to be used effectively in a tutor-based context. 

• The report is adequately documented with respect to procedures and methods. 

Limitations 

• The findings are based on learner self-report. 

• The findings pertain to tutor-based programming. Because tutor-based 
programming is generally one-on-one in orientation and because tutors are 
generally less well-trained than regular adult literacy education instructors, the 
study cannot be generalized to adult literacy education in other contexts. 

• Matched sets of data could be obtained from less than 50 percent of the 
sample. 

• The variables included in the study are based narrowly on reading and writing 
behavior. Thus no data are available on personal and social impact. 

• The findings of the report are portrayed primarily in graphic format, and the 
few tables presented are difficult to interpret. 

• Lack of a control or comparison group makes it difficult to infer causality. 

8. The New Jersey Study 



Darkenwald, G., & Valentine, T. (1984). Outcomes and impact of adult basic 
education. New Brunswick, NJ: Rutgers University, Center for Adult 
Development. 



The goals of the New Jersey Study were: 

1. To determine the impact of adult Basic Education in New Jersey in 
terms of (a) attainment of students’ own goals for participation and (b) 
program effects on tangible indicators of social and economic well being. 

2. To ascertain the nature and impact of costs and benefits for New 
Jersey’s adult high school completion program, including a comparison of 
the GED and adult high school options. 

3. To design a model, 
instrumentation and procedures for ongoing statewide student follow-up. 
(Darkenwald & Valentine, 1984, p. 2) 

Relevant outcome variables were identified through a review of the 
literature and through consultation with the project’s advisory board. Because of 
cost limitations, the study excluded ESOL and special populations such as 
prisoners and the mentally retarded. 

For the outcome component of the study, a random sample was selected 
using the probability proportionate to size model. Seeking a sample of about 400 
students enrolled in 10 programs, the enrollments of all New Jersey programs 
were divided into clusters of 40 students and each cluster was assigned a number. 
Then 10 clusters of 40 students were randomly selected from all the clusters. 
Nine programs were thus represented. One program, a very large one, was 
selected twice, and 80 students were entered into the sample rather than 40. 
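
The cluster-based selection described above can be sketched in a few lines. Each program contributes one numbered cluster per 40 enrolled students, so a large program holds many clusters and can be drawn more than once. The program names and enrollments below are hypothetical, and discarding any leftover partial cluster is an assumption; the report as summarized here does not say how remainders were handled.

```python
import random
from collections import Counter

CLUSTER_SIZE = 40


def build_clusters(enrollments, cluster_size=CLUSTER_SIZE):
    """Return one program label per full cluster of enrolled students.

    Assumption: students left over after the last full cluster are ignored.
    """
    clusters = []
    for program, enrolled in enrollments.items():
        clusters.extend([program] * (enrolled // cluster_size))
    return clusters


def draw_programs(enrollments, n_clusters=10, seed=None):
    """Randomly draw clusters without replacement and count the draws
    per program. A count of 2 means 80 students come from that program."""
    rng = random.Random(seed)
    clusters = build_clusters(enrollments)
    return Counter(rng.sample(clusters, n_clusters))


# Hypothetical statewide enrollments by program.
enrollments = {
    "Program A": 1200, "Program B": 400, "Program C": 240,
    "Program D": 160, "Program E": 120, "Program F": 80,
}
draws = draw_programs(enrollments, n_clusters=10, seed=1984)
for program, count in draws.items():
    print(program, count * CLUSTER_SIZE, "students")
```

Because the chance of selection rises with enrollment, drawing a fixed 40 students per selected cluster gives each student roughly the same overall probability of entering the sample.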

Once programs were selected, permission to collect data was sought and 
all programs agreed. Then, for each program, a list was compiled of learners who 
had enrolled no later than October 1983 and had achieved 12 hours of instruction 
by November 5, 1983. From the lists of eligible students, 40 were randomly 
selected from each program (80 from the program selected twice). Data were 
collected in April/May 1984; thus all learners had been enrolled seven to eight 
months at data collection. Data were collected through telephone interviews using 
a 27-item survey that included several open-ended questions that were 
subsequently inductively coded. The survey was field-tested with groups of 10 
students and modified accordingly. Two hundred and ninety-four responses were
received for an unadjusted response rate of 74 percent. When invalid cases were 
removed from the sample (e.g., unlisted or no phones) the adjusted response rate 
was 97 percent. 
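The relationship between the two response rates can be checked with simple arithmetic. The sketch below assumes a drawn sample of 400 (10 clusters of 40, as described above); the report summarized here gives the rates but not the number of invalid cases, so the count of invalid cases is back-calculated from the reported 97 percent and should be read as an inference, not a reported figure.

```python
def response_rate(responses, sample_size, invalid_cases=0):
    """Responses divided by the cases that could actually be reached."""
    return responses / (sample_size - invalid_cases)

responses = 294
sample_size = 400  # assumption: 10 clusters of 40 students

unadjusted = response_rate(responses, sample_size)

# Back out the number of valid cases implied by the reported adjusted rate.
implied_valid = round(responses / 0.97)
implied_invalid = sample_size - implied_valid
adjusted = response_rate(responses, sample_size, implied_invalid)

print(f"unadjusted: {unadjusted:.1%}")
print(f"implied invalid cases: {implied_invalid}")
print(f"adjusted: {adjusted:.1%}")
```

On these assumptions, roughly a quarter of the drawn sample would have been set aside as invalid, which is worth keeping in mind when reading the 97 percent figure.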

The high school completion component of the study examined benefits for
two groups: 1. learners who had participated in publicly-funded adult
literacy programs and had achieved high school certification by passing the GED
tests and 2. learners who had completed the New Jersey Adult High School
program. Through the New Jersey Adult High School, learners earn regular 
school district diplomas. For sample selection, 300 GED graduates were 
randomly selected from a list of those who had passed the GED tests between 
January and April 1982. All had completed adult literacy instruction between 14 
and 18 months previously. Because a list of adult high school graduates could not
be obtained, subjects were randomly drawn from five (of the nine) programs that 
had been selected for the outcome study and operated an adult high school. Data 
were collected through a short mailed survey for which the unadjusted response 
rate was 50 percent and the adjusted response rate was 64 percent. 

Findings 

Adult Basic Education 

In interpreting findings it should be noted that, in its April/May 1984 data
collection, the New Jersey Study surveyed all students who had enrolled in the 
preceding October. Thus the April/May 1984 data collection included the 60 
percent of students who were no longer attending the program at that time. 

Employment 

Seven to eight months after they had entered the program, the net gain in 
employment for all respondents was 12.5 percent. Of those who were 
unemployed and seeking work at the beginning of the study, the net gain in 
employment was 16.4 percent. 

Of those who were employed at the beginning of the study, 18 percent
reported that they had changed jobs by the end of the study, and of these, 61 
percent said that they had obtained a better job. Sixty-five percent felt that their 
job performance had improved, 42 percent reported that they had received a raise, 
14 percent said that they had received a promotion, and 57 percent indicated that 
their job security had improved. 
Of those who were seeking employment at the time of data collection, 79
percent reported that they believed that their participation in ABE would improve 
their employment prospects. 

Basic Skills 

Learners were asked whether participation in ABE had helped them to 
become better readers. Eighty-nine percent responded yes. They were then asked 
if reading outside class had enabled them to do something they could not do 
before. Two thirds responded yes. Then learners were asked to name some of the 
things they could now do with their acquired reading skills. One hundred and 
fifty-eight answered with 345 ways in which they had employed their reading 
skills. Reading newspapers, reading magazines, and reading books were 
mentioned by 20 percent or more of the learners. 

Sixty-three percent indicated that classes had helped improve their writing,
and use of writing outside of the classroom was reported by 49 percent of the 
respondents. Eighty-five percent of the learners said that participation in ABE 
had helped their math skills and 58 percent indicated that they used their newly 
acquired math skills outside the classroom. Finally, respondents were asked if 
participation had helped them with other things. Social studies (26 percent) and 
interpersonal skills (20 percent) were mentioned most often. Only 4 percent 
mentioned job-seeking skills. 

Personal Goals 

Respondents were asked to what extent participation in ABE had helped 
them to reach their own personal goals. Thirteen percent responded with 
“totally,” 49 percent answered with “a lot,” 26 percent said “some,” and 12 
percent responded “a little” or “not at all.” Twenty-one percent reported that the
class had helped them earn high school certification, 67 percent indicated that the 
class had not, and 12 percent said that they already had a high school diploma. 

Further Education 

Sixty-three percent of the learners reported that they planned to enroll in 
further education and training in the future, 18 percent said they did not plan to 
engage in further education, and 19 percent indicated that they were uncertain. 
Computing (20 percent), secretarial studies (18 percent), and allied health (17 
percent) were the most commonly mentioned areas of planned future study. 
Twenty-three percent planned to study in a community college, 19 percent
planned to study in a public vocational-technical school, and 11 percent planned
to study at a four-year college. 

Public Assistance 

Twenty-seven percent of the learners reported that they had received some 
sort of public assistance since the preceding October. Of these, two thirds said the 
amount of public assistance had remained the same, 19 percent said that the
amount had decreased, and 15 percent reported that public assistance had been
eliminated. Of those for whom public assistance had terminated, 44 percent said 
the reason was job acquisition, and 9 percent reported that the reason was 
increased income. Nearly half (48 percent) indicated that the reason was other 
than the reasons listed. 

Other Outcomes 

When asked in general whether they felt better about themselves as a 
result of attending the adult education program, 92 percent responded in the 
affirmative. Of those who had children, 75 percent reported that they helped 
children with homework more, 81 percent said that they talked to their children 
more about school, 73 percent reported that their children had developed a better 
attitude towards school, 75 percent reported that their children were getting better 
grades, and 50 percent indicated they had become more involved with the schools. 
In closing, the New Jersey study’s survey asked learners to specify the single most 
important benefit they obtained from participation. Thirty-nine percent indicated 
academic skills, 32 percent said enhanced self-confidence, 10 percent indicated a 
GED or high school diploma, 9 percent reported job-related benefits, and 5 
percent indicated enhanced personal skills. 

High School Completion 

Employment and Earnings 

Of those graduates who were initially unemployed and seeking 
employment, 58 percent obtained employment. This gain is offset by 15 percent 
of graduates who were initially employed but lost their jobs. Of those who were 
initially employed part-time, 44 percent obtained full-time employment. Seven 
percent of those who were employed full-time at the outset of the study became 
employed part-time. 
Of those who were employed both at the beginning and end of the study,
45 percent reported that they had obtained better jobs and 93 percent reported
gains in earnings. In addition, 29 percent indicated that they had received 
promotions, 78 percent reported that they were more likely to keep their jobs, and 
76 percent said that they were able to do their jobs better. 

For graduates who were employed full-time at the beginning and end of 
the study, the increase in weekly take-home pay was $64. For graduates employed 
part-time, both initially and at the termination of the study, the increase was $34. 
For graduates who were employed part-time but obtained full-time employment, 
the gain was $88. For the total sample, the increase in take-home pay was $26. 
Average number of months employed increased from 7.7 to 9.0. 

Other Benefits 

Incidence of welfare enrollment decreased 45 percent. Twenty-nine 
percent of graduates enrolled in college and 31 percent enrolled in a trade or 
technical school. Mean agreement with the statement “The classroom
instruction I received was very helpful in preparing me to take the GED
test” was 3.7 on a 5-point scale.



It should be noted that the economic benefits gained in 14 to 16 months 
after receiving a high school diploma for the New Jersey study’s population were 
considerably higher than those reported by other studies. Given the study’s 
limitations, the reasons for these abnormally high gains are not known. 



Strengths 

• In its description and explanation of research methods and procedures, the 
New Jersey study is a model of clarity and completeness. Considerable 
attention was paid to explaining technical complexities in terms the lay person 
could understand. 

• The response rate for the ABE study was extremely high in comparison to 
other studies, and the response rate for the “benefits to high school 
completion” component was quite good. 

• The variables and survey items eliciting them were well-conceived and clear. 
Limitations

• Because there was no control group it is difficult to infer that participation in 
adult literacy education or high school completion caused noted outcomes. 

• Differences were not measured as changes between a pre-survey and a post-survey.
Rather, differences are based on respondents’ recall of their status in the 
October preceding the April/May data collection. 

• All findings are based on learner self-report. 

• With respect to the ABE study, although the time period covered by the study 
was six to seven months, the actual hours of instruction received by 
respondents may have been relatively short. Although the mean hours of 
instruction received per respondent over the duration of the study is not 
reported, an attrition rate of 60 percent is reported and average hourly class 
attendance by month is reported as: October=32 percent, November=26
percent, December=13 percent, January=14 percent, February=13 percent, and 
March=15 percent. 
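To give a rough sense of how thin attendance was over the period, the monthly figures quoted above can be averaged. The sketch below takes a simple unweighted mean of the six monthly percentages; this is an assumption for illustration, since the report does not give enrollment counts by month that would allow a weighted average or a conversion to hours of instruction.

```python
# Average hourly class attendance by month, in percent, as quoted above.
attendance = {
    "October": 32,
    "November": 26,
    "December": 13,
    "January": 14,
    "February": 13,
    "March": 15,
}

# Unweighted mean across months (assumption: each month counts equally).
mean_attendance = sum(attendance.values()) / len(attendance)
print(f"mean monthly attendance: {mean_attendance:.1f} percent")
```

A mean below 20 percent is consistent with the concern that many respondents received relatively little instruction during the study period.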



9. The Utah Study 



Mahaffy, J. E. (1983). Impact evaluation of adult basic education program 
outcomes: ABE Evaluation Project final report. Bozeman, MT: Montana State
University Department of Educational Services. 



The purpose of the Utah study was to determine the impact of 
participation in adult basic education. Impact was measured on two dimensions: 
impact on the quality of learners’ lives and financial impact. Impact on students’ 
lives was measured by such variables as gains in knowledge, skills, community 
participation, self-confidence, and further education. Financial impact was 
measured by gains in employment and income. 

The sample for the study included two groups — currently enrolled learners 
and learners who had left the program within the previous five years. Because 
former students were defined as those who terminated the program for any reason, 
it cannot be assumed that they left the program because of completion. On the 
average, currently enrolled students had received 214 hours of instruction and 
former students had attained 338 hours of instruction. The time of termination of 
learners in the former student group varied from one year prior to measurement to 
five. 
For both groups, the design called for random selection from a sample of
programs stratified according to program size. Nine programs were selected for 
the study from the 19 operating at the time, and students and former students were 
randomly selected for participation. However, because many of those selected 
could not be contacted and others refused to participate, the random selection 
procedure failed to secure a sufficient number of subjects to conduct the study.
As a result, additional students were selected in the order that they appeared on
the lists used for selection. “Ultimately 516 former and current students were
interviewed, 328 of the 381 former students desired and 188 of the 190 current
students desired” (Mahaffy, 1983, p. 18). Thus, although programs were selected
randomly, the research subjects were not. Current students differed from former 
students on several demographic dimensions including gender and age. Data were 
collected with a focused survey instrument administered by personal interview at 
program sites and by interviewers trained by the project. 

Findings 

Nineteen percent of current students and 72 percent of former students 
reported that their goals had been achieved. The difference is not surprising given 
that one would expect current students to terminate once their goals had been 
achieved. 

Basic Skills 

Perceived improvement in basic skills was measured on a 3-point scale
that ranged from “a lot” to “little or none.” For mathematics, 52 percent of the
current students and 47 percent of the former students responded “a lot.”
Comparable figures for other basic skills areas were: reading (current=43 percent,
former=37 percent), writing (current=43 percent, former=41 percent), and 
communication (current=28 percent, former=53 percent). Thus for all areas 
except communication, current students reported greater improvement in basic 
skills than former students. 

Perceived improvement in life skills areas was also measured. Using the 
same scale, subjects responded “a lot” as follows: how to find and keep a job 
(current=21 percent, former=16 percent), how to get the most value from your 
money (current=21 percent, former=25 percent), rights and responsibilities as a 
citizen (current=25 percent, former=27 percent), knowledge of community 
resources (current=13 percent, former=24 percent), and health care for self and 
family (current=7 percent, former=9 percent). Current students were more likely 
than former students to respond “a lot” to use of the library and to reading
magazines and books. Former students were more likely to participate in 
community organizations and to vote. There were no differences between the 
groups with respect to attending parents’ meetings at school and talking to school 
officials about their children. 

Further Education 

Sixteen percent of the current students had enrolled in other educational 
programs prior to enrollment in ABE. Thirty-six percent of former students had 
enrolled in other educational programs after termination from ABE. However, 
only 17 percent of the former students indicated that ABE had influenced their
desire to enroll in other education programs. Eighty-four percent of current 
students reported a desire to enroll in additional educational programs and 86 
percent of former students indicated a desire. 

Personal Lives 

Learners were asked to report on perceived changes in their personal lives 
on a 5-point scale with “much better” anchoring the high end of the continuum. 
Based on responses of “much better,” former students were slightly more satisfied 
with their lives in general than current students (former=63 percent, current=57 
percent). Former students were also more likely to report that they were getting 
on well with their jobs (former=30 percent, current=18 percent), had higher levels 
of self-confidence (former=67 percent, current=59 percent), and were more 
satisfied with their family lives (former=52 percent, current=45 percent). There 
were no significant differences between the groups with respect to getting along 
with other people and feeling good about one’s self. Former students were more 
likely to report that as a result of participation in ABE their relationship with their 
children had improved and that their children were more interested in school. 
Seventy-three percent of former students and 21 percent of current students 
reported positive changes in their lives from participation in ABE. 

Financial Impact 

At the time of the interview, 28 percent of current students were receiving 
some sort of government financial assistance. The rate for former students was 25 
percent. At the time of the interview, 32 percent of current students were 
employed and 49 percent of former students were employed. The mean annual 
income for current students was $6,422 and the comparable figure for former 
students was $7,483. Twelve percent of current students and 57 percent of former
students reported that their financial condition had improved.

Strengths 

• The report is comprehensive and clearly written. Methods and procedures are 
adequately explained. Findings are presented in detailed tables. 

• The study included a wide range of variables. 

• By and large the number of subjects is adequate. 

• The study includes a comparison group. 

Limitations 

• All findings are based on learners’ self reports. The recall time for former 
students could have been as long as five years. 

• The comparison groups are conceptually flawed. If one wanted to measure the 
impact of adult literacy instruction, a group that had received no instruction 
would logically have been compared to a group that had received substantial 
instruction. However, the current student group had received adult literacy 
instruction. In fact, at the time of data collection, current students had 
received an average of 214 hours of instruction while former students had 
received 338. More important, the former student comparison group included 
students who had terminated the program for some reason. Thus this group 
includes successful completers as well as program dropouts and there is no 
way to distinguish between the two. 

• Hours of instruction are not controlled. 

10. The Maryland Study 



Walker, S. M., Ewart, D. M., & Whaples, G. C. (1981). Perceptions of program 
impact: ABE/GED in Maryland. [Paper presented at the Lifelong Learning 
Research Conference, College Park, MD, February 6 and 7, 1981] 



The purpose of this study was to assess learners’ perceptions of ABE/GED 
program impact in Maryland. The subjects were 120 participants in ABE/GED 
programs who, at the time of data collection, had received an average of six 
months of instruction. Forty-five percent of the subjects had been enrolled for 
three months or less. It was originally intended that five subjects would be 
randomly selected from a series of ABE programs stratified according to program 
size in three Maryland counties, but because some selected subjects refused to
participate and others requested to participate, the random selection protocol was 
not strictly followed. Of the 120 subjects who volunteered for the study, adequate 
data were collected from 116. 

Data were collected by graduate student interviewers using an instrument 
based on prior research and discussions with learners, administrators, and 
teachers. Interviews ranged from 13 minutes to an hour and averaged about 20 
minutes. Interviews were conducted near the end of the semester, and at that 
time, some classes were so small that another class had to be selected from the 
same center. 

Findings 

Employment 

Of the subjects who were unemployed and the 20 percent who were
homemakers, 84 percent desired to obtain employment in the future and 85
percent believed that participation in ABE would help them do so. Of those who
were employed, two thirds believed that ABE would help them gain promotion.
Sixty-three percent perceived themselves as being able to do their jobs better.

Basic Skills 

Seventy-six percent of the students reported that they could read better as a 
result of participation, 81 percent perceived that they could write better, 90 
percent believed that their computational skills had improved, and 63 percent 
reported that they were better shoppers. Math was most often named as the
most useful thing learned (52 percent).

Community Involvement 

About half the participants reported that they belonged to at least one 
community organization, although 70 percent of the learners said that 
participation in ABE had not affected their membership in community 
organizations. Half the subjects indicated that they had recently used the library. 
Fifty-three percent expressed some interest in politics and 35 percent claimed that 
they were more interested in politics since having participated in ABE.
Fifty-eight percent reported that they were more aware of community services.
Attitudinal Change

Eighty-nine percent of the students reported that they felt different about 
themselves as a result of participation. Of those with school-age children, 52 
percent reported that they were more confident about discussing problems with 
their children’s teachers. 

Personal Relationships 

Of the 86 percent of respondents who had children and reported that their 
children asked for help with homework, 61 percent indicated that they were more 
confident about giving help with homework. 

Continuing Education 

Ninety-one percent of the learners said that they planned to continue their 
education once they had completed the ABE program. 

Strengths 

• The report is clearly written. Methods and procedures are adequately 
described and the report is honest about its limitations. 

• The study is comprehensive with respect to the variables studied. 

Limitations 

• Because the research subjects were not selected randomly, and because all 
subjects came from only three Maryland counties, the results cannot be 
generalized to Maryland or to any larger unit. 

• Almost half the research subjects had been enrolled for three months or less. It 
is questionable whether there was a sufficient amount of instruction gained 
between enrollment and data collection for significant impact to be measured. 

• The study lacked a control or comparison group. Thus it is questionable to 
infer that participation in ABE caused the changes noted.

• All findings are based on learners’ self reports. 
11. The 1980 Tennessee Study



Jones, P. L., & Petry, J. R. (1980). Evaluation of adult basic education in
Tennessee, 1980. Memphis, TN: Memphis State University, College of 
Education. 



The 1980 Tennessee Study measured learners’ perceptions of their 
educational experience in Tennessee ABE programs. Data were collected with a 
26-item instrument. Variables were selected from the research literature and 
responses were recorded on 5-point Likert scales. Variables included
self-expression, self-concept, family life, life in general, leisure, relationships with
others, and relationships with society. Thus, unlike most of the other impact 
evaluations assessed in this report, the 1980 Tennessee Study focused primarily 
on the affective dimension. 

To select a sample, first a sample of programs across the state was 
identified (the report does not say how). Then each program was mailed a packet 
of surveys and the supervisor was asked to identify 25 students who would 
receive the survey. Each supervisor was asked to administer the survey 
personally. Information was received from 72 of the 89 programs then active in 
Tennessee. From a potential of 2,225 responses, 1,623 responses were received 
for a response rate of 73 percent. 

Findings 

Learners scored 3.9 on the 5-point scale measuring self-expression. 
Students who had been in the program longer, who were more affluent, and were 
older scored higher on this dimension. Students scored 3.9 on the scale measuring 
self-concept. Those who had participated in the program longer than 18 months
scored higher on self-concept. Subjects scored 3.7 on the scale that measured 
whether they perceived that the ABE program had helped them to be more 
confident about their family relationships. Again, learners who had participated 
in the program longer scored higher. A composite of six items measured life in 
general, and subjects scored 4.0 on this measure. On the scale that measured 
whether learners’ interests had expanded to make their leisure time more 
meaningful, the mean score was 4.0. 

With respect to relationships with others, learners scored 4.0 on the scale 
used to measure this dimension. The mean score for the scale that measured 
social responsibility was 3.8. Students scored 4.1 on the scale that measured their
perceptions of the value of their education.

Strengths 

• In general, the report is clear, and methods and procedures are adequately 
reported, although descriptions of methods are less detailed than in most of the 
other studies reviewed here. 

• The sample size is adequate (n = 1,623), and the 73 percent response rate to the
survey is relatively high. 

Limitations 

• Rather than being randomly selected, the subjects for the study were selected 
by ABE supervisors. This introduces the possibility of selection bias.

• Data from specific survey items seem to have been collapsed into the 
summative scale scores reported, and this results in a loss of detail. 

• All data are based on self-report. 

• There is no control group. 

• All subjects were enrolled ABE students. Because 91 percent had been active 
in the program for six months or less, it is questionable whether there was 
sufficient time for substantial impact to occur. Because all subjects were 
enrolled students, data were not collected from those who had terminated the 
program either because they had achieved their goals or had dropped out. 

12. The Ohio Study 



Boggs, D. L., Buss, T. F., & Yarnell, S. M. (1979). Adult basic education in Ohio:
Program impact evaluation. Adult Education, 29(2), 123-140.

The objective of the Ohio study was to determine the extent to which the 
goals of the Adult Education Act, the federal legislation that funded ABE in Ohio, 
were being achieved. Specifically the study sought to ascertain whether, as a 
result of participation in ABE programs, learners’ occupational status had 
improved, whether learners had been assimilated into society, and whether 
learners’ personal goals had been met. 

The population for the study was learners who had terminated ABE in 
1973-1974. Follow-up data were collected three years later. Data were collected 
with two telephone surveys, one of which was administered to former students,
the other of which was administered to adults who were eligible for ABE but had 
never participated. Although the comparison group is an important feature of the 
study, it is important to note that the comparison group is not a true control group 
as there was no random assignment to instruction and to control groups. Indeed, 
because the comparison group was representative of the general population 
eligible for ABE, and because the treatment group was representative of ABE 
participants, the two groups are different with respect to socio-demographic 
composition. 

A multistage sample design was used to create a sample representative of 
the approximately 27,000 ABE students in 1973-74. First, a stratification 
protocol was designed based on ABE program size, municipality size, gender, 
age, and race. Then, from these strata, 12 ABE programs were randomly selected 
to participate in the study. Of the approximately 3,500 former ABE students who
had been enrolled in these programs, 1,200 had valid telephone numbers or
addresses. Data were collected by telephone by trained interviewers in April and 
May of 1977. Three hundred and fifty-one valid responses were received. The 
average completion time per interview was 13.6 minutes and, on average, it took 
four telephone calls to secure a usable survey. The estimated sampling error was 
5.3 percent. 

A multistage sampling design was also used to establish the comparison 
group of adults who were eligible for ABE but had not participated. Initially the 
comparison group population was stratified into 88 counties. Then the counties 
were grouped into 11 sampling areas in order to allow for urban-rural distribution.
From telephone books acquired by the study, 1,500 comparison group subjects
were randomly selected. Subjects were surveyed by telephone between May and 
July 1977. Over 90,000 calls were made at an average completion time per 
interview of 13 minutes. The completion rate was 13 to 1 and the estimated 
sample error was 2.4 percent. 
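The account given here does not say how the Ohio study estimated its sampling error. As a point of reference only, the conventional 95 percent margin of error for a proportion from a simple random sample can be computed as below. The formula, the worst-case proportion of 0.5, and the use of 351 and 1,500 as the effective sample sizes are all assumptions for illustration, and a multistage design would normally call for a design-effect adjustment that is not attempted here.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Conventional margin of error for a proportion, simple random sample.

    n: number of completed responses
    p: assumed proportion (0.5 gives the widest interval)
    z: critical value (1.96 for 95 percent confidence)
    """
    return z * math.sqrt(p * (1 - p) / n)

former_students = margin_of_error(351)     # former ABE students
comparison_group = margin_of_error(1500)   # comparison group (assumed n)

print(f"former students: {former_students:.1%}")
print(f"comparison group: {comparison_group:.1%}")
```

Under these assumptions the results come out near 5.2 and 2.5 percentage points, close to the 5.3 and 2.4 percent quoted above, which suggests the study's estimates were of this general kind without confirming the exact method.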

Findings 

Involvement 

Three years after they had left the ABE program, former participants 
scored significantly higher than non-participants on use of the library, reading 
magazines, use of social services, community activity, reported self-confidence,
and communication. Non-participants scored higher on continuing their
friendships.

NCSALL Reports #6, January 1999

Employment/Home Ownership 

Former ABE students scored higher than non-participants on incidence of 
employment, being promoted, and perceived job security. Non-participants were 
more likely to own their own homes than former ABE students, although the 
difference between the groups was small. 

Children/School 

Former ABE students scored higher on attending parents’ meetings. The 
differences between former ABE students and non-participants were not 
significant with respect to helping children with homework. Non-participants 
scored higher than former ABE students on communication with the schools. 

Voting 

Although former ABE students scored higher than non-participants on 
being registered to vote, there were no significant differences with respect to 
having voted in 1972 or 1976. 

Education 

Former ABE students were more likely to be enrolled in school and to be 
contemplating future enrollment than non-participants. 

Learners’ Personal Goals

Of those who stated that their goal at enrollment in the ABE program was 
to improve their English (27 percent of the sample), 96 percent reported that they 
had achieved the goal. Of the 77 percent of the sample who had as their goal 
improving math, 97 percent reported that they had achieved this goal. Sixty-two
percent of the sample indicated that improving their reading was a goal at entry; 
96 percent reported that they had reached this goal. Of the 62 percent of the 
sample that indicated that obtaining a GED was one of their goals, 40 percent 
reported goal attainment. 
Strengths

• The study design was systematic, and the design and procedures are clearly 
described in the report. 

• The variables studied are comprehensive. 

• The number of subjects is adequate for detailed analysis. 

• The study includes a comparison group. 

Limitations 

• Because the Ohio study is one of the few that includes a comparison group, 
the limitations of the Ohio comparison group warrant discussion. There are 
two potential reasons for including control or comparison groups in an impact 
design: 1. to enable researchers to conclude that participation in adult literacy
education caused a particular impact and 2. to compare performance of adult 
literacy education students with a particular reference group with known 
characteristics. In the Ohio study, former ABE students were compared with 
non-participants. Unfortunately, this comparison does not allow us to infer 
that ABE caused the differences between the groups that were found. The 
reason is that non-participants differ substantially from participants in ways 
that have nothing to do with ABE instruction. For example, if Ohio is similar 
to the rest of the country, adults who are eligible for ABE but have never 
participated are likely to be significantly older than participants. Are the 
differences noted in the study due to participation or are they the result of the 
differences in age between the two groups? We do not know, and, not 
knowing, the inference that ABE caused the difference cannot be made with 
great confidence. 

• Although use of telephone interviews appeared to be an efficient way to 
collect data, those without telephones or valid addresses could not be reached. 
Of the 3,500 former ABE students selected for the study, only 1,200 had valid 
phone numbers and only 351 were eventually reached. Because only about 1 
in 10 of the original sample could be reached, the potential for response bias is 
high. 

• All data were based on self-reports, and the recall period for former ABE 
students was three years. 

• The study does not control for the amount of instruction learners received. 

13. The Wisconsin Study 



Becker, W. J., Wesselius, F., & Fallon, R. (1976). Adult basic education follow-up 
study 1973-75. Kenosha, WI: Gateway Technical Institute. 



The objectives of the Wisconsin study were to measure student 
perceptions of the program, to identify economic and employment impact, to 
assess learners’ attainment of personal goals, to assess further education and 
social impact, and to measure exit level reading ability. 

The study’s description of design and procedures begins with the 
acknowledgment that certain contextual features of adult literacy education had to 
be taken into account in the design of the study. They were: lack of uniform 
grade levels, non-English speaking learners who were not literate in their native 
language, varying attendance, open enrollment, multiple goals for participation, 
and varying reading ability. Although these contextual constraints are well 
known by most experienced researchers of adult literacy education, the Wisconsin 
study is one of the very few that has acknowledged the problems these features 
pose in study design. The authors of the study note that, although a control group 
would have been desirable, establishment of a control group was beyond the 
means of the study and that, while collection of pre-data at the time of enrollment 
would have been desirable, such data were not available to the researchers. 

Data were collected from a sample of Gateway participants who had 
terminated in 1973, 1974, and 1975. Gateway was a comprehensive adult literacy 
program that operated learning centers and classes in prisons, communities, 
training schools, institutions for the handicapped, and ESL. Of the learners 
sampled in the study, 133 attended learning centers, 72 attended classes, and 65 
attended both learning centers and classes. In 1973 1,340 learners were enrolled, 
in 1974 1,133 learners were enrolled, and in 1975 the program served 1,703 
learners. 

To select a sample, first a 10-percent sample was randomly drawn from 
participant lists, and the 111 students who were still active in the program were 
eliminated. The resulting sample was then divided into four groups based on 
hours of instruction attained (category 1= 0 to 24 hours, category 2=26 to 50 
hours, category 3=51 to 100 hours, category 4=100 or more hours of instruction). 

Each selected student was notified of the study by letter and was told that 
an interviewer would contact them if they indicated by return postcard that they 
were willing to participate in the study. Terminated students who volunteered 
were placed on one list (n=306). In addition, two other 10-percent samples were 
randomly drawn to serve as replacements if students from the first list could not 
be contacted (n=612). From the initial list and the replacement lists, a total of 593 
former students were contacted, and in total, 270 usable interviews were 
completed (response rate=.45). Sixty-four learners had completed in 1973, 95 had 
completed in 1974, and 111 had completed in 1975. In 1973, 67 percent of the 
sample had 50 or fewer hours of instruction. The figures for 1974 and 1975 are 40 
percent and 49 percent respectively. 
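
The response-rate arithmetic in this paragraph can be checked with a short sketch. The counts are the ones reported above; the function name `response_rate` is an illustrative assumption and does not come from the Wisconsin report.

```python
def response_rate(completed: int, contacted: int) -> float:
    """Share of contacted former students who yielded a usable interview."""
    if contacted <= 0:
        raise ValueError("contacted must be positive")
    return completed / contacted


# Usable interviews by year of program completion, as reported in the text.
completed_by_year = {1973: 64, 1974: 95, 1975: 111}
total_completed = sum(completed_by_year.values())

# 593 former students were contacted across the initial and replacement lists.
rate = response_rate(total_completed, 593)
print(f"{total_completed} usable interviews, response rate = {rate:.3f}")
```

The yearly counts sum to the 270 usable interviews cited, and the ratio works out to roughly .455, which the report gives as .45.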

The survey instrument was based on input from teachers, aides, 
counselors, and researchers. Most interviewers were ABE professionals and 
paraprofessionals who were trained in a one-day session. Data were collected 
between March and June of 1976. In total, of 589 anticipated interviews, 273 
were completed and 319 could not be completed. Seventy percent of the non- 
completions were due to the inability to secure up-to-date addresses. As part of 
the interview, to measure reading level, respondents were asked to take a one-page 
excerpt from the reading section of the Wide Range Achievement Test (WRAT). 
Interviewers were trained in the administration of the test. 

Findings 

Learners’ Personal Goals

Taken as a whole, 64 percent of terminated learners for all three years 
rated the program as having been very useful, 34 percent rated it as helpful, and 
only 3 percent rated it as not being helpful. Those with 100 or more hours of 
instruction were most likely to rate the program as being very helpful. Improving 
reading, math, and writing were the three most commonly given goals of 
participation. Ninety percent reported that the program had helped them in 
reading, 82 percent indicated that the program had helped them in math, and 83 
percent said that they had been helped in writing. Of the 44 percent of the sample 
that indicated that earning a GED was one of their goals, 27 percent reported that 
they had passed the GED tests. Of the 152 respondents who reported that they 
wished to enter another educational program upon completion of the ABE 
program, 33 percent did enter another program, and 54 percent of them responded 
that the ABE program had adequately prepared them for the additional 
educational experience. When asked if they had completed what they wanted to 
accomplish, 62 percent reported that they had not. Eighty-two percent indicated 
that they were satisfied with their accomplishments. 

Employment 

The most frequently mentioned job-related reason for participation (34 
percent of sample) was to learn a specific skill for a specific job. Thirty-six 
percent of those who indicated this reason reported that they had learned a skill 
sufficiently to qualify for the job they wished. Half the respondents had applied 
for jobs since termination from the ABE program, and of these, 81 percent had 
had job interviews. Of those who had applied for jobs since termination, 82 
percent said they had had either full-time or part-time employment. Of those who 
had been employed since termination, 41 percent reported that ABE did help them 
get a job. Twenty-three percent reported that they terminated the ABE program 
because they obtained a job. In total, two thirds of the respondents reported no 
change in employment status since terminating from the program. Twenty-three 
percent increased employment and 11 percent experienced a decrease in 
employment. 

Family Relationships 

Learners with school-age children were asked if participation in ABE had 
helped them understand their children’s teachers and school better. Sixty-four 
percent replied in the affirmative. Sixty percent indicated that participation had 
not encouraged their participation in teacher conferences and PTA, and 57 percent 
responded that participation had not changed the way they worked with their 
children. Twenty-nine percent responded that, since participating in ABE, 
household members got along better with each other. 

Perceptions of the Value of the ABE Program 

Learners were asked to indicate the subject that had been most helpful 
when they were enrolled in ABE, and, of all the subjects listed, math (17 percent) 
and English (16 percent) were considered to be most important. Forty-four 
percent of the respondents reported that ABE had helped them to become better 
off financially; 41 percent mentioned that ABE had helped them manage money 
better. When students were asked to specify the most important changes from 
participation, improving self-confidence (23 percent) was the most commonly 
mentioned change and overcoming shyness (21 percent) was the second most 
commonly mentioned change. 

Community Participation 

Eighty-four percent of the respondents indicated no change in their voter 
registration status, and of those who did become registered to vote, only 5 percent 
said that they had done so because of ABE. Eighty-five percent indicated no 
change in community group participation. Of those who did increase community 
group participation, only 4 percent attributed the change to ABE. 

Reading Gains 

As part of the interview, respondents were administered the reading 
portion of the Wide Range Achievement Test (WRAT). These scores were 
compared with earlier scores obtained at both enrollment in the program and 
shortly before termination. Seventy-eight percent exhibited an increase of grade 
level between enrollment and follow-up and 19 percent showed a decline. 
Seventy-three percent of the respondents reported that they purchased 
newspapers, magazines, and books prior to participation in ABE, while 84 percent 
said they did so at follow-up. However, 62 percent indicated that participation in 
ABE had not influenced their purchase of these reading materials. Fifty-five 
percent said that they read more since the time they entered ABE, 37 percent said 
they read the same, and 8 percent said they read less. 

As was mentioned in the description of methods and procedures, the 
sample was divided into groups based on the number of hours of instruction 
attained. Although data by these groupings are reported throughout the findings, 
they have not been reported here for simplicity’s sake. 

Strengths 

• The report is extremely detailed in both the descriptions of methods and 
procedures and in its findings. 

• The study includes a multitude of impact variables. 

• Sampling procedures are adequate. 

• The number of subjects is adequate for most analyses. 

Limitations 

• All data, except for the WRAT scores, are based on self-report. 

• The validity and reliability of the one-page excerpt from WRAT are not known. 

• There is no control or comparison group. 

• Much of the data pertain to a comparison between learners’ goals at the time 
of entry and at follow-up. However, rather than collecting learner goal data at 
entry, the study asked respondents to recall their goals at the follow-up data 
collection. Since follow-up data were collected in 1976, the recall period for 
the 1973 cohort was three years and for the 1974 cohort it was two years. 

• The response rate of .45 introduces the possibility of response bias. 

• In the 1973 cohort, two thirds of the former students had achieved 50 or fewer 
hours of instruction. This raises the question of whether an adequate amount 
of instruction had been gained for changes to be measured. 

• The findings cannot be generalized beyond the Gateway service area. 

WELFARE STUDIES 

With the establishment of the federal JOBS program in 1988, the numbers 
of adult literacy education learners served in welfare-sponsored and -funded 
programs increased dramatically. Although the specific details and provisions of 
JOBS programs varied by state, and to some extent by county, a common 
provision of JOBS-funded adult literacy education was mandatory participation 
for those who were deemed in need of basic education. In addition, JOBS clients 
also received support services, such as transportation and child care, and were 
required to participate in employment readiness activities such as job search. 

JOBS clients who were not eligible for adult literacy education were either 
enrolled in vocational education or slotted directly into employment. Most JOBS 
adult literacy education activities were subcontracted by the welfare system to the 
traditional providers of adult literacy education — public schools and community 
colleges. 

The relationships between education/training, welfare participation, and 
work were hotly debated during the national discussion that surrounded welfare 
reform in 1996, a point of contention being whether it was better to invest in 
substantial education and training prior to requiring welfare recipients to acquire 
employment, or whether it was more socially beneficial to move welfare 
recipients directly into employment with education and training coming 
afterwards. The Personal Responsibility Act of 1996, which established major 
reform of the nation’s welfare system, promoted the work first, educate second 
principle as national policy. 

The purpose of this section of the report is neither to debate welfare policy 
nor to discuss how the research literature informs that debate. Rather, the purpose 
here is to analyze the research conducted on the impact of adult literacy education 
within welfare-sponsored programs for its contribution to a general understanding 
of the impact of adult literacy education. 

14. The California GAIN Study 



Martinson, K., & Friedlander, D. (1994). GAIN: Basic education in a welfare-to-work 
program. New York: Manpower Demonstration Research Corporation. 



The above report is one of several issued by the Manpower Demonstration 
Research Corporation (MDRC) to convey the findings of its evaluation of the 
California GAIN program. GAIN (Greater Avenues for Independence) was 
California’s program for increasing employment and reducing dependence on 
AFDC (Aid to Families with Dependent Children). GAIN, which was initiated in 
1986, became California’s JOBS initiative in 1989 and accounted for 13 percent 
of all federal money allocated to JOBS. MDRC’s evaluation of GAIN is 
exceedingly detailed and refined in its methodology and procedures, and for that 
reason represents a very important contribution to our understanding of adult 
literacy education’s impact. However, the fact that the study focused on a special 
population of adult literacy education — clients who were eligible for welfare and 
were required to participate in adult literacy education — is a limitation that 
precludes the generalization of its results to adult literacy in general. 

Enrollment in the GAIN program began with an initial assessment of 
eligibility. According to the law, single welfare recipients with children under age 
six were exempt, but could volunteer for the program. After registration, clients 
attended an orientation and assessment, which included administration of the 
CASAS test. Those who scored below 215 on either the CASAS reading or math 
tests, lacked high school certification, or could not speak English well were 
assessed as being “in need of basic education” and assigned to programs for either 
GED preparation, ABE, or ESOL, depending on their skill levels. Participation 
was mandatory; clients who failed to participate could incur a reduction of their 
benefits. After completion of basic education, clients participated in a formal 
assessment where an individual employment plan was developed that could 
include additional training, unpaid work, or supported work. Most adult literacy 
education instruction was contracted by GAIN to California’s public schools and 
community colleges. 

The GAIN evaluation was conducted in six counties: Los Angeles, 
Riverside, San Diego, Alameda, Tulare, and Butte. Although there were 
similarities among the counties with respect to implementation of the law, the law 
was sufficiently flexible to allow for differences in administration. Together these 
six counties accounted for about one third of the entire California caseload. 

The evaluation employed a random assignment design. Between 7 and 14 
months after the county implemented GAIN, all clients who were determined to 
be mandatory registrants and who attended an orientation session were randomly 
assigned to either a treatment group, which received GAIN services and was 
subject to the participation mandate, or to a control group, which did not receive 
GAIN services and was not subject to the mandate. Control group members were 
permitted to enroll in adult literacy not sponsored by GAIN if they wished to. 
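
The logic of random assignment can be illustrated with a minimal sketch. It assumes each mandatory registrant has an equal chance of landing in either group; the summary above does not state the assignment ratio actually used in GAIN, and the function name, seed, and client IDs are hypothetical.

```python
import random


def randomly_assign(client_ids, seed=0):
    """Assign each client to 'treatment' or 'control' by an independent coin flip.

    Assumption: a 50/50 chance per client, chosen only for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    groups = {"treatment": [], "control": []}
    for client_id in client_ids:
        groups[rng.choice(("treatment", "control"))].append(client_id)
    return groups


# Hypothetical registrants who attended an orientation session.
ids = [f"client-{i}" for i in range(1000)]
groups = randomly_assign(ids, seed=42)
print(len(groups["treatment"]), "treatment;", len(groups["control"]), "control")
```

Because chance alone decides group membership, the two groups should resemble each other on measured and unmeasured characteristics, which is what permits causal inference from later differences.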

Data Sources included: 

• Baseline data {n=l\, 596). Included basic demographic information and 
CASAS test scores. 

• AFDC and UI records (n=21,392). Included data on welfare payments and 
employment earnings. 

• Registrant survey data (n=3,210; 2,551 valid cases). Administered to a 
stratified random sample of treatment and controls by phone or in person two 
to three years after random assignment. The survey measured self-reported 
participation in a number of activities as well as GED acquisition. Subjects 
were paid $25. 

• A literacy test (TALS) (n=1,719; 1,119 valid cases). The Test of Applied 
Literacy Skills was administered in the client’s home two to three years after 
random assignment. Subjects were paid $35. Data were collected from 
treatment and controls who were not ESOL students. 

• GAIN Program Tracking Data (n=4,523). Recorded clients’ patterns of 
GAIN participation from case files. 

• Attendance data (n=1,222). Data on attendance were collected from 
educational providers. 

• Staff Activities and Attitudes Survey. This survey was administered to GAIN 
staff twice, one and two years after GAIN began. 

• Field Research. Interviews with administrators and teachers to provide 
descriptive information about GAIN program operations. 

Findings 

Participation in adult literacy education was mandatory for the GAIN 
treatment group and was not mandatory for the control group. As might be 
expected, for the two- to three-year period, the participation rate in ABE/GED for 
the treatment group (30.2 percent) was significantly higher than for the control 
group (5.8 percent). Differences in the participation rates of the two groups for 
vocational (9.4 percent treatment, 9.5 percent control) and post-secondary 
education (11.6 percent treatment, 9.8 percent control) were not significant. The 
same relationships were found in length of participation as measured by both the 
number of months of participation and the number of hours of participation. 

GED Attainment 

Over the two- to three-year follow-up period, GAIN treatment group 
participants were considerably more likely to have earned a GED or other high 
school certification (treatment=7.2 percent, control=1.3 percent), and, not too 
surprisingly, success in obtaining high school certification was found to be highly 
related to the learners’ initial educational levels as measured by the CASAS test. 
Perhaps the most striking finding regarding high school certification came from 
Tulare County, where the difference between treatment and control groups with 
respect to earning high school certification was 19 percent. This compares with a 
difference of only 7.7 percent for the next highest county, Alameda. Why was 
Tulare County’s performance on high school certification so outstanding in 
comparison to the other counties? Although there is no definitive answer offered 
by the report, the authors note that this “may indicate that the package of 
educational services offered in Tulare — the GAIN program's emphasis on 
education, the counseling and close monitoring GAIN provided, and the education 
services offered by the education providers — contributed to the large attainment 
impacts in Tulare” (Martinson & Friedlander, 1994, p. 118). 

Tested Learning Gain 

Learning gain was measured by the document and quantitative literacy 
components of the TALS. The document literacy section (26 items) measures the 
skills needed to work with documents such as use of charts and forms. The 
quantitative literacy section (23 items) measures the ability to perform 
calculations embedded in text. Although the TALS includes a prose literacy 
section, it was not used in the GAIN study. 

The difference between the treatment group and control group on the 
combined document and quantitative portions of the TALS was a statistically 
nonsignificant 1.8 points, suggesting that, on the whole, no meaningful tested 
learning gain had occurred two to three years after random assignment to groups. 
During that time, treatment group members had received an average of 251 
scheduled hours of instruction. Differences between counties were striking, 
however. Although the control groups in Riverside and Tulare actually 
outperformed the treatment groups, the treatment group in San Diego 
outperformed the control group by a very sizable 33.8 percent. Commenting on 
this, the report states that “the county differences would appear instead to reflect 
differences in the implementation of the basic education program across counties” 
(Martinson & Friedlander, 1994, p. 131). 

As with acquisition of high school certification, initial educational 
achievement as measured by the CASAS had an important impact on tested 
learning gain. For those whose initial CASAS scores were 215 or above on both 
the reading and math tests, the difference between the treatment and control 
groups was 17.8 points, 19 percent of a standard deviation, while the difference 
for those who scored below 214 for both tests was -17.1, indicating that the 
control group outperformed the treatment group. 
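
The two subgroup differences can be put on a common scale with a little arithmetic. The sketch below backs out the standard deviation implied by the statement that 17.8 points equals 19 percent of a standard deviation, then expresses the -17.1 difference in the same units. Applying one standard deviation to both subgroups is an assumption made here for illustration; the derived figures are not reported in the GAIN study.

```python
# Reported in the text: higher-scoring subgroup difference and its size in SD units.
diff_high_points = 17.8
diff_high_in_sd = 0.19

# Implied standard deviation of the combined TALS score (derived, not reported).
implied_sd = diff_high_points / diff_high_in_sd

# Reported difference for the lower-scoring subgroup, rescaled with the same SD.
# Assumption: the same standard deviation applies to both subgroups.
diff_low_points = -17.1
diff_low_in_sd = diff_low_points / implied_sd

print(f"implied SD = {implied_sd:.1f} points")
print(f"lower-scoring subgroup difference = {diff_low_in_sd:.2f} SD")
```

Under that assumption the two differences are of similar magnitude, about a fifth of a standard deviation each, but in opposite directions.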

Strengths 

• The report is very detailed in its descriptions of methods, procedures, and 
findings. Limitations are discussed in-depth. 

• The study utilized a true experimental design; subjects were randomly 
assigned to either the treatment group or the control group. As a result, solid inferences 
regarding causality can be made. 

• The duration between treatment and measurement (two to three years) is 
sufficiently long for meaningful impact to have occurred. 

• Inferences are based on tested gain rather than on learner self-report. 

Limitations 

• The subjects of the research were a special population of adult literacy 
education learners (welfare clients). Participation in the program was 
mandatory. Hence the study can only be generalized to this population. 

• Only two impact variables were studied — tested learning gain and acquisition 
of high school certification. 

• The Test of Applied Literacy Skills used to measure learning gain may not 
have been sensitive enough to measure the learning gains that occurred. 

Comment 

The California GAIN study is one of the very few that used a true 
experimental design, and for this reason we may be more confident that adult 
literacy education instruction caused the reported impacts than for other studies 
we have thus far reviewed. However, experimental design studies are not without 
their own weaknesses. Paramount among these is that we do not know why or 
how these impacts were achieved. Thus we are left with an unresolved ambiguity. 
Why did Tulare County produce such high gains on acquisition of high school 
certification and poor gains on tested learning gain? Likewise, why did San 
Diego County show very high gains on tested learning gain and low gains on high 
school certification? Although the authors of the report speculate that to some 
extent the answers lie with the manner in which the two programs were 
conducted, we do not know what the relevant program differences were. 

The authors of the GAIN study concluded that significant gains were not 
made in basic skills, and this appears to be true in the aggregate. However, given 
the large gains in San Diego, an equally reasonable conclusion might have been 
that gains were made in some programs, but not in others. The conclusion on 
GAIN, then, depends somewhat on whether one considers the proverbial cup to be 
half full or half empty. 

15. The Texas JOBS Program Evaluation 



Center for the Study of Human Resources. (1994). Texas JOBS program 
evaluation: Final report. Austin, TX: Center for the Study of Human Resources, 
Lyndon B. Johnson School of Public Affairs, University of Texas at Austin. 

Schexnayder, D. T., & Olson, J. A. (1995). Texas JOBS program evaluation: 
Second year impacts. Austin, TX: Center for the Study of Human Resources, 
Lyndon B. Johnson School of Public Affairs, University of Texas at Austin. 



Like the California GAIN evaluation, the Texas JOBS evaluation is a 
sophisticated and rather complex assessment of a state welfare program. 
Although much of the evaluation focuses on the economic benefits that accrue to 
JOBS clients and to the state, the effects of participation in the adult literacy 
education component of the JOBS program are addressed. 

In Texas, during its first two years (1991-1992), 84,000 AFDC recipients 
were enrolled in the JOBS program. Almost all were women. Upon initial 
assessment, JOBS clients were separated into three service levels. Level I clients 
possessed high school certification, had recent job experience, and had few 
barriers to employment. They were certified as being job ready. Level II clients, 
who were designated as being less job ready, lacked high school certification, 
had less recent job experience, and had more barriers to employment. Level III clients had very 
little education and substantial barriers to employment, and, because helping them 
was beyond the resources of the program, they were not served by JOBS. 

Although Level I clients were generally referred to job readiness and job search 
activities, they were permitted to participate in education and/or job skills training 
if they wished. Level II clients were generally assigned to adult education and 
survival skills training before they received job skills training. More than one 
third received child care assistance and three fifths received transportation 
assistance. 

In the 1994 report, program impacts were measured a year and a half after 
program entry. The evaluation used a quasi-experimental design. A treatment 
sample was created by randomly selecting six samples of 3,300 JOBS clients by 
calendar year quarter starting with the first quarter of 1990 and ending with the 
second quarter of 1992 (total n=19,854). Clients who entered the program during 
specific quarters are termed cohorts in the study, and much of the analysis is 
presented by cohorts. 

For the purposes of the study, a client was defined as a female recipient of 
AFDC with at least one hour of participation in any JOBS component. To 
actualize the quasi-experimental design, a matched comparison group of 
approximately 20,000 was selected from non-JOBS AFDC recipients. This was 
achieved by first exactly matching treatment group members with non-JOBS 
AFDC clients on Department of Human Services region, service level, 
race/ethnicity, groups of employment services, and presence on AFDC rolls 
during the quarter in question. Then the sample was adjusted with a procedure 
that paired comparison group members with treatment group members who were 
similar with respect to age, number of children, age of youngest child, and total 
time on AFDC. A comparison between the treatment and comparison group on 
these variables showed no significant differences. 
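
The two-stage matching described above can be sketched as follows: an exact match on categorical characteristics, then pairing on similarity of the remaining variables. The evaluation's actual pairing procedure is not described here, so the greedy nearest-neighbor step, the distance function, the reduced set of fields, and all names and values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Client:
    client_id: str
    region: str          # Department of Human Services region
    service_level: str   # "I" or "II"
    ethnicity: str
    age: float
    n_children: int
    months_on_afdc: float


def exact_key(c: Client):
    """Stage 1: characteristics that must match exactly."""
    return (c.region, c.service_level, c.ethnicity)


def distance(a: Client, b: Client) -> float:
    """Stage 2: a simple, hypothetical dissimilarity score (smaller is closer)."""
    return (abs(a.age - b.age)
            + abs(a.n_children - b.n_children)
            + abs(a.months_on_afdc - b.months_on_afdc) / 12)


def match(treated, pool):
    """Greedily pair each treated client with the closest unused exact match."""
    available = list(pool)
    pairs = []
    for t in treated:
        candidates = [c for c in available if exact_key(c) == exact_key(t)]
        if not candidates:
            continue  # no exact match available; the treated client stays unmatched
        best = min(candidates, key=lambda c: distance(t, c))
        available.remove(best)  # match without replacement
        pairs.append((t.client_id, best.client_id))
    return pairs


# Made-up records, not data from the Texas evaluation.
treated = [Client("T1", "R1", "II", "A", 30, 2, 24)]
pool = [
    Client("C1", "R1", "II", "A", 45, 1, 60),
    Client("C2", "R1", "II", "A", 31, 2, 20),
    Client("C3", "R2", "II", "A", 30, 2, 24),  # different region, so excluded
]
pairs = match(treated, pool)
print(pairs)
```

Matching of this kind balances the groups only on the variables used; unlike random assignment, it cannot rule out differences on unmeasured characteristics.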

Consistent with an evaluation of a welfare program, the impact 
(dependent) variables used included probability of exit from AFDC, probability of 
exit from AFDC to employment, probability of employment regardless of AFDC 
exit, quarterly earnings, and probability of AFDC recidivism. The impact of 
education, which was measured by the number of hours of participation in high 
school, GED preparation, ABE, post-secondary education, and self-initiated 
education, was analyzed through multiple regression analysis. Multiple 
regression analysis is a statistical procedure that measures the impact of each of a 
series of independent variables on a dependent variable. In this case, the 
dependent variables were each of the impact variables listed above. The 
independent variables were the various components of the JOBS program. They 
were: education, whether the client had only received assessment, life skills 
training, job training, job search activities, job assessment, and whether the client 
was currently engaged in any JOBS activities. Data were collected from JOBS, 
JTPA, and other program records. 
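
To make the idea of multiple regression concrete, here is a minimal sketch that estimates the separate contribution of two program components to a single outcome by ordinary least squares. The data are synthetic and noise-free, only two independent variables are used, and the helper names are hypothetical; none of this reproduces the Texas evaluation's actual model or software.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic clients: (hours of education, hours of job training).
rows = [(0, 0), (100, 0), (0, 50), (100, 50), (200, 25), (50, 100)]
# Made-up outcome: quarterly earnings built from known effects, so the
# regression should recover 500 (baseline), 2.0 per education hour,
# and 4.0 per job-training hour.
y = [500 + 2.0 * edu + 4.0 * job for edu, job in rows]

intercept, b_edu, b_job = ols(rows, y)
print(f"intercept={intercept:.1f}, education={b_edu:.2f}, job training={b_job:.2f}")
```

Each coefficient is read as the change in the outcome associated with one more unit of that component, holding the other components constant.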

The findings reported here are from the second year impact report 
(Schexnayder & Olson, 1995). Because about one third of the comparison group 
members had entered the JOBS program by the time the second year study was 
conducted, they (n=6,460) and their counterparts in the treatment group were 
eliminated from the study. For the second year sample, the treatment group 
sample size was 13,196 and the comparison group sample size was 13,303. 

Impacts were measured between two and two and a half years after entry into the 
program. 

Findings 

Thirty-two percent of the Level I clients and 59 percent of the Level II 
clients had received education. It should be noted that education included high 
school, GED preparation, ABE, and post-secondary education. Thus it is 
impossible to identify the specific impact of any of these components. 

Termination of AFDC 

In total, participation in JOBS had either a negative or statistically 
nonsignificant effect on termination of AFDC when measured four quarters after 
entry. However, “By the last quarter measured, JOBS participation increased 
AFDC exits by 4-16 percent in all but one group” (Schexnayder & Olson, 1995, p. 
12). For both Level I and Level II clients, participation in education had a 
statistically significant impact on exit from AFDC. Although job training also 
had a significant impact, life skills education, job search, and job assessment did 
not. 

Exit from AFDC to Employment 

A consistently greater percentage of treatment group members than of 
comparison group members exited AFDC for employment. For both Level I and 
Level II clients, the differences between the treatment and comparison group were 
in the 1 to 8 percent range, and the differences tended to increase over time. 
Participation in education and in job training had a statistically significant impact 
on exit from AFDC to employment. Participation in life skills did not. 

Employment Regardless of AFDC Exit 

One year after entry to the JOBS program, the employment rates for the 
treatment and comparison groups were similar and ranged from 38 to 45 percent. 

Over time, however, the treatment group outperformed the comparison group in 
all cohorts except one. The impact of education on employment was statistically 
significant for Level I clients (those who were certified as being job-ready) but 
not for Level II clients, and the impact of job training was significant for both 
levels. The impacts of life skills education, job search, and job assessment were 
not significant. 

Quarterly Earnings 

Earnings for both treatment and comparison group members increased 
over time. After 10 quarters, JOBS participants’ earnings were significantly 
higher than comparison group members’ earnings for 5 of the 10 cohorts. Adult 
education had a positive and significant effect on earnings for both Level I and 
Level II clients, as did job training. The impacts of life skills education, job 
search, and job assessment were very small and mostly not statistically 
significant. 

Return to AFDC 

JOBS participants were more likely to return to welfare after they exited 
than the comparison group. However, education, job training, and job search 
reduced recidivism for Level I clients. Participation in job training was the only 
variable that reduced recidivism for Level II clients. 

Strengths 

• Although it is complex, the report adequately describes methods, procedures, 
and limitations. 

• The sample is representative and quite large. 

• The design includes a relevant comparison group. 

• The measures are direct rather than based on self-report. 

Limitations 

• The research subjects are drawn from a special population (welfare recipients) 
and thus the findings cannot be generalized to adult literacy education in 
general. 

• Because education is defined as an amalgam of high school, GED preparation, 
and post-secondary education, the separate impact of these components cannot 
be determined. 



16. Steps to Success 



Reder, S., & Wiklund, K. R. (1994). Steps to Success: Literacy development in a 
welfare-to-work program. Portland, OR: Northwest Regional Educational 
Laboratory. 



Established in 1990, Steps to Success was Oregon's largest JOBS program 
serving clients at two locations — in Multnomah County and the Portland 
metropolitan area. It was administered by Mt. Hood Community College. After 
screening, which included a one-hour interview and the Basic Adult Skills 
Inventory System (BASIS) test, clients were placed in one of two tracks. Those 
who were assessed to have marketable skills were placed in the Placement Track 
and participated in a mix of job training, job search, and placement activities. 

Those who were deemed to be in need of adult literacy and job training 
were placed in the Career and Life Planning Track (CLP). Those assigned to the 
CLP track generally first participated in a four-week course designed to facilitate 
job readiness and future self-sufficiency. After the initial course, those whose 
BASIS tests indicated that they needed literacy education participated in 
ABE/GED. Attendance was required for a minimum of 15 hours per week. 
One-on-one instruction in a learning lab and small group instruction were available. 

The Steps to Success outcome study included both qualitative and 
quantitative components, but, because the qualitative component is not 
extensively reported, the case analysis will focus on the quantitative. The study 
commenced with the development of a sample. To this end a list was secured of 
all clients who had participated in ABE/GED (n=920). The records for each case 
included names and addresses, basic demographic information, the BASIS score 
from intake assessment, and participation in Steps to Success activities. Letters 
were then sent to all members of the list inviting their participation in the study 
and promising a $25 stipend to those who agreed. Eighteen data collection 
sessions were offered for the research subjects. Subsequently a second mailing 
was sent and six additional data collection sessions were scheduled. 

During data collection, subjects were administered a questionnaire that 
compiled participants’ activities and perceptions including current education and 
employment status, additional education and employment since leaving Steps, 
perceptions of how Steps had helped them in various aspects of their lives, 
whether Steps might have prepared them better, and perceptions about 
improvements in their basic skills. 

Two tests were employed. The baseline test, the BASIS, is an adaptation 
of the CASAS Test specially prepared for Steps by CASAS. The second test 
administered during data collection was the CASAS (level C) itself. Data 
collection occurred between one and three years after the clients had terminated 
Steps. 

One hundred and nine of the 920 letters sent were returned by the Post 
Office. Of the remaining 811, 229 agreed to participate in the study, for a 
response rate of 27 percent. Of those who agreed, data were eventually collected 
on 163. Comparison of demographic information of the sample to county welfare 
clients showed that the sample was more predominantly female and less 
predominantly minority than the general welfare population. A comparison of 
participants in the study with those who elected not to participate showed that 
participants were more likely to be female, less likely to be minorities, and had 
slightly higher initial BASIS test scores. 
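
The sample attrition described above can be retraced with simple arithmetic. The following Python sketch uses only the counts given in the text; the variable names and the `pct` helper are illustrative and do not come from the original study. Dividing 229 by 811 directly gives roughly 28 percent, close to the 27 percent reported; the text does not say how the reported figure was computed or rounded.

```python
# Counts as reported in the case summary of the Steps to Success study.
letters_sent = 920
returned_by_post_office = 109
agreed_to_participate = 229
data_collected = 163


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Letters presumed delivered (illustrative name, not from the report).
letters_delivered = letters_sent - returned_by_post_office

agreement_rate = pct(agreed_to_participate, letters_delivered)
completion_rate = pct(data_collected, agreed_to_participate)
coverage_rate = pct(data_collected, letters_sent)

print(f"Delivered letters: {letters_delivered}")
print(f"Agreed, as share of delivered letters: {agreement_rate}%")
print(f"Data collected, as share of those who agreed: {completion_rate}%")
print(f"Data collected, as share of all ABE/GED clients: {coverage_rate}%")
```

The last figure, the share of the 920 listed clients for whom data were eventually collected, is the one most relevant to judging how far the sample can speak for the full group.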

Findings 

Learning Gains 

Because the BASIS test and the CASAS test differed in the ranges they 
measured accurately, the researchers reported two sets of figures: gains for the 
entire group and gains only for those who fell within the accurate range of the test. 
For reading, only 53 persons fell within the range of the test. Mean scores were 
231.3 on the BASIS pretest and 235.5 on the CASAS posttest for a gain of 4.2 
(p < .001). Of the 53 subjects, 37 showed gains in reading, 3 showed no difference, 
and 13 demonstrated declines. 

For math, the entire group had pretest scores of 221.1 and posttest scores 
of 229.0 for a gain of 7.9 (p < .0001). One hundred and thirty-five of the 163 
cases showed gains, 5 showed no difference, and 23 demonstrated declines. For 
the group that fell within the accurate range of testing on math (n=127), the mean 
pretest score was 217.9 and the mean posttest score was 226.7 for a gain of 8.8 
(p < .0001). One hundred and thirteen of the 127 showed gains, 5 showed no 
difference, and 9 had declines. 
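
The reported gains and the share of subjects who improved can be recomputed from the figures above. This is a minimal sketch: the dictionary keys and field names are illustrative labels rather than terms from the study, and the code only restates the reported means and counts. It says nothing about statistical significance.

```python
# Pre/post means and subject counts as reported in the case summary.
# Keys and field names are illustrative labels, not the study's own.
results = {
    "reading (in range)": {"pre": 231.3, "post": 235.5,
                           "gained": 37, "same": 3, "declined": 13},
    "math (entire group)": {"pre": 221.1, "post": 229.0,
                            "gained": 135, "same": 5, "declined": 23},
    "math (in range)": {"pre": 217.9, "post": 226.7,
                        "gained": 113, "same": 5, "declined": 9},
}

summary = {}
for label, r in results.items():
    # Group size is the sum of the three outcome counts.
    n = r["gained"] + r["same"] + r["declined"]
    summary[label] = {
        "n": n,
        "mean_gain": round(r["post"] - r["pre"], 1),
        "pct_gained": round(100 * r["gained"] / n, 1),
    }

for label, s in summary.items():
    print(f"{label}: n={s['n']}, mean gain={s['mean_gain']}, "
          f"gained={s['pct_gained']}%")
```

In each group the three outcome counts add up to the stated group size (53, 163, and 127), which gives a quick internal consistency check on the reported numbers.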

Of the 79 subjects who lacked a GED when they entered Steps, 22 (28 
percent) had acquired a GED during the follow-up period. When subjects were 
asked whether Steps had helped them improve their basic skills, for math 37 
percent responded that they had improved “a lot,” 41 percent responded 
“somewhat,” and 21 percent said “no change.” Responses for reading were 37 
percent for “a lot,” 42 percent for “somewhat,” and 26 percent for “no change.” 
For writing, responses were 29 percent for “a lot,” 48 percent for “somewhat,” and 27 
percent for “no change.” 



Employment 

Forty-nine percent of the respondents reported that Steps had helped them 
to acquire employment. Thirty-eight percent reported that they were employed, 
and of these, 57 percent were working full-time and 43 percent were working 
part-time. Presumably, none of the subjects were employed before they entered 
Steps and enrolled in ABE/GED. 

Further Education 

Sixty-seven percent indicated that they had participated in additional 
training or education after leaving Steps. 

Termination of Welfare 

A discriminant analysis showed that math learning gain and whether the 
subject was currently employed were associated with whether a subject was still 
on welfare after terminating Steps. 

Strengths 

• The report is very well documented with respect to methodology and study 
limitations. The discussion of literature on welfare and adult literacy is 
excellent. 

Limitations 

• Data were collected on only 163 of the 920 intended subjects. Study 
participants differed from non-participants with respect to gender and minority 
status. All subjects volunteered for the study. This raises the possibility of 
selection bias. 

• Only 53 subjects fell within the accurate range of the reading tests and 127 fell 
within the accurate range of the math tests. 

• Data were collected on subjects between one and three years after termination 
from Steps. Thus without a control group it is difficult to know whether 
unknown factors, such as continued education, intervened in the gains noted. 

• The perception of program success data is based on self-report. 



WORKPLACE LITERACY 

In recent years there has been a considerable increase in the attention paid 
to workplace adult literacy education. Much of this attention can be traced to a 
shift in public policy toward the human capital aspects of adult literacy 
education — the idea that a primary function of adult literacy is to enhance 
individual and societal economic productivity. Workplace literacy is defined as 
adult literacy education conducted in learners’ places of employment to enhance 
individual job performance. 

In the early 1990s the number of workplace literacy programs 
increased considerably as a result of a provision of the Adult Education Act that mandated a 
set-aside for federally funded workplace education programs. As the number of 
workplace literacy programs increased, so did the number of impact evaluations 
available. However, as Kutner, Sherman, and Webb (1991) and Mikulecky and 
Lloyd (1996) have noted, most of these evaluations are primarily descriptive and 
seriously flawed in their research designs and procedures. The workplace literacy 
evaluations presented here are the most credible of the many evaluations we 
initially reviewed. 



17. Mikulecky and Lloyd 



Mikulecky, L., & Lloyd, P. (1996). Evaluation of workplace literacy programs: A 
profile of effective instructional practices. Philadelphia, PA: University of 
Pennsylvania, National Center on Adult Literacy. 



Rather than being a report of a single workplace evaluation, this study sought to 
develop and refine a workplace literacy evaluation model. The report is reviewed 
here because, in the process of developing the model, several workplace literacy 
programs were assessed. The focus of the assessment was changes in learners’ 
beliefs about personal effectiveness with literacy, changes in learners’ literacy 
practices, learners’ literacy improvement with general and workplace materials, 
and changes in learners’ goals. Data were collected by interview from 10 groups 
of learners (n=181) in six different companies in a pretest, posttest format. Pretest 
data were collected at the beginning of each course and posttest data were 
collected toward the end of each course. Learners were asked about themselves, 
their literacy abilities, their reading and writing practices, how they read print, and 
their future educational plans. 

Data on learners’ literacy practices, beliefs, and plans were gathered 
through open-ended questions. For example, the question posed for learners’ 
literacy practices was, “Tell me the sorts of things you read on the job during a 
normal week.” Reading process was assessed with job-specific scenarios in 
which learners were presented with job-related materials and asked to explain how 
they read them. 

Quantifiable interview responses were analyzed statistically. For the 
open-ended questions, categories were allowed to emerge from the data, and, 
when categories were sufficiently refined to yield an inter-rater reliability of 90 
percent or more, they were analyzed quantitatively. For other open-ended 
questions, a holistic comparison was made between pre- and post-scores, and the 
change was rated either positive, neutral, or negative. 

The evaluation itself was based on measuring the relationship between 
various course characteristics and the data collected from learners. Course 
characteristics measured included total instructional time, topic (workplace and 
home/family orientation), and talk (discussion of literacy beliefs, plans, and 
reading and writing processes). Data for course characteristics were collected from 
course syllabi, other descriptive information, and observation. Ratings of course 
characteristics were made by two researchers who compared results, discussed 
differences in ratings, and came to consensus. All classes enrolled between 10 
and 15 learners and most were 20 to 50 hours in duration. Data were analyzed 
using analysis of variance (ANOVA). 

The six workplace literacy program sites were: 

1. A large manufacturing plant that offered GED preparation for 4 hours per 
week over 6 weeks and ESOL for 8 hours per week for 6 weeks. 

2. A women’s prison at which staff participated in report writing and instruction 
that enhanced skills needed for promotion. For report writing, 28 staff 
participated for three hours per week for 13 weeks; for promotion support, 9 
participated for 3 hours per week for 7 weeks. 

3. A small insurance company. Twenty learners participated for about 40 hours 
to improve job-related reading and writing. 

4. A hospital in which 19 learners attended a computer-based writing course for 
20 hours. 

5. A large gasket maker. Ten learners studied job-oriented reading and writing 
for 50 hours. 

6. A company that manufactured electric motors where 33 learners studied 
reading for 30 hours. 

Findings 

Findings are aggregated for all six programs studied. Learners with 200 
hours of instruction (all subjects were enrolled in one program) made significant 
gains in practice away from work. Learners with 50 or fewer hours of instruction 
did not make significant gains. 

With respect to the self-reported sophistication in the reading processes 
they would employ, those who spent over 70 percent of their time in class in 
reading and writing had a mean gain of over three times that of learners who spent 
less than 70 percent of their time on reading and writing. 

All learners gained in job-related scenario comprehension, but learners 
who used job-related examples in class 20 percent or more of the time gained 
almost twice that of learners who did not. Learners who had discussions of 
literacy beliefs and plans as deliberate components of their classes scored three 
times higher on reading scenarios than did learners who did not. 

Learners who had deliberate discussions of literacy beliefs as part of their 
classes made significant gains in literacy self-efficacy, while learners for whom such 
discussions were merely incidental did not. 

Learners gained in the area of specific, detailed future plans if they spent 
over 50 percent of their time in class reading and writing, if they used workplace 
examples in class at least 20 to 30 percent of the time, if they had deliberate 
discussions about literacy beliefs and plans, and if they had deliberate discussions 
about reading processes. Other learners did not make such gains. 

Strengths 

• The conceptual scheme and variables studied seem particularly relevant for 
workplace literacy. 

Limitations 

• The ANOVA is based on the aggregated data from six workplace literacy 
programs. The small number of programs interjects the possibility that 
unreported differences among programs may have influenced the findings in 
unknown ways. For example, findings on the relationship between practices 
away from work and hours of instruction divide hours of instruction into two 
categories: 50 hours or fewer and 200 hours. However, only one program, a 
large manufacturing plant, offered 200 hours of instruction. Thus it is not 
known if the ANOVA actually measured hours of instruction or other 
characteristics associated with this one program. 

• The definition and meaning of variables is not always clear. 

• The study lacks a control or comparison group. 

• To a great extent the study relies on learner self-report, although such data 
were collected quite systematically. 

18. Manufacturing and Financial Services in the Chicago Area 



Mrowicki, L., Jones, D., Locsin, T., Olivi, L., & Poindexter, C. (1995). Workplace 
literacy in a total quality management environment for the manufacturing and 
financial services industry. Final performance report. Des Plaines, IL: The 
Center-Resources for Education. 



The goal of this workplace literacy project was to improve the productivity 
and efficiency of 19 manufacturing companies and two financial institutions by 
providing education to workers lacking basic skills. Customized curricula and 
materials were developed. Of the 21 companies that participated, 10 conducted 
ESOL classes, six had math classes, nine offered reading and writing, one offered 
only writing, and one conducted a communications class. Overall, 161 classes 
were conducted. The average number of classes per company was eight; 1,526 
learners were served. 

Most of the report provides descriptive information on the various 
companies’ activities. The evaluation of the project employed a modification of 
the Context-Input-Process-Product (CIPP) model of program evaluation. Context 
was defined as the shared goals and philosophy of key personnel and participants. 
Input included resources such as personnel, materials, time, and facilities. Process 
was defined as the extent to which instruction was congruent with project goals. 
Product was defined as indicators of project effectiveness. Because of limited 
resources, site visits were restricted to five companies that were considered to be 
representative of project operations. These five were visited four times and three 
other sites were visited once. The five companies were: Avon Products, R. Olson 
Manufacturing, Allied Die Casting Company, E.J. Brach, and First Chicago 
National Bank. 

Data for the evaluation were collected through post-program participant 
surveys; structured interviews with participants, instructors, and program 
personnel; observations of instruction; reports of instructor training; participant 
records; and pre- and posttest scores. 

Findings 

From post-program surveys and on-site interviews it was determined that 
most learners were satisfied with course content and that participants had gained 
confidence in using the skills they had learned. One hundred percent of those surveyed said 
that they would recommend participation in the program to others. 

Pre- and posttests were administered to a limited sample (the selection of 
the sample, its size, and the test are not described). The mean pretest score was 
54 percent and the mean posttest score was 80 percent for a gain of 26 percent. 
Instructor and company representatives’ comments about the program were 
positive. 

Strengths 

• The study presents a useful descriptive picture of the program. 

Limitations 

• The description of methods and procedures is very sparse, and important 
information such as sample selection procedures, sample size, and the nature 
of the test used is not explained. As a result, it is difficult to assess the 
validity of the study. 

• Most of the data are anecdotal and based on self-report. 

• There are no control or comparison groups. 

19. Workplace Literacy across the Three Phases of Textile Manufacturing 



Fisk, W. (1994). Workplace literacy across the three phases of textile 
manufacturing. National workplace literacy demonstration project. Performance 
report. Clemson, SC: Clemson University. 



The goals of this workplace literacy project were to promote and improve 
employees’ literacy skills in order to increase worker productivity in a textile 
manufacturing plant. Task analysis preceded the development of a job-relevant 
curriculum that focused on math, reading, and communication skills. Instruction 
was integrated into the three phases of the manufacturing cycle (spinning/weaving, 
finishing, and fabrication) and was conducted in two plants. Two 
hundred and forty employees participated in classes during the grant period. Ninety-nine of 
these participated in GED preparation classes. 

Although the evaluation design had both formative and summative 
components, all the impact data are found in the summative section. Data were 
collected through seven site visits; interviews with project supervisors and 
participants; and questionnaires administered to teachers, participants, and plant 
supervisors. Instrumentation included teacher-made tests. In some cases pre- and 
posttests were administered, in others only posttests were given, and in some 
cases, no tests were used. Data on the validity and reliability of the tests are not 
given. A supervisor’s questionnaire was developed by the evaluator to measure 
supervisors’ perceptions of project implementation and impact on participants. 
Employee and teacher questionnaires were developed for the same purpose. 
In addition, a six-item course evaluation form was administered by project staff 
and the evaluator at course completion. No information is provided regarding 
sample sizes or survey return rates. 

In addition a “long-term” follow-up was conducted 10 months after the 
termination of classes conducted during the first project grant period with a 
random sample of 24 employees. A 21-item survey was used for this purpose. 

Findings 

Plant one: In one department 13 students had completed math pre- and 
posttests and 17 had completed reading tests. These students demonstrated 
statistically significant increases in math and vocabulary. In another department, 
math scores increased significantly. Some participants exhibited changes in 
communication behavior after attending class. Many students refused to take the 
tests. 



Plant two: Because this plant changed from classroom instruction to 
computer assisted instruction, it was difficult to measure tested learning gains of 
the 113 learners who participated. Learners made significant gains in math, and 
there were positive changes in learners’ attitudes after attending the 
communications class. 

Although it was originally intended to measure the program’s impact on 
productivity, this plan was abandoned because it was thought that the seven- to 
eight-week classes were of insufficient duration to create measurable impacts on 
productivity. 

With respect to the study conducted 10 months after completion of the 
first year project (the evaluation reviewed here is of the second year of operation), 
it was found that, of the 186 fabrication plant project participants, 74 (40 percent) 
voluntarily enrolled in additional courses for an average of 22 hours of instruction 
per employee. Of the 24 randomly selected participants who were administered 
the follow-up survey, data indicated that the project had a positive impact on 
interactions with co-workers and supervisors on the job, interactions with their 
families, and satisfaction with their daily lives. Positive work habits were formed 
and employees accepted increased responsibility on the job, in the family, and in 
the community. 





NCSALL Reports #6 



January 1999 



Strengths 

• The report contains a useful descriptive account of workplace literacy 
activities. 

Limitations 

• Important information, such as survey return rates, is not supplied. 

• Much of the data are self-reported. 

• In many cases the sample size is quite low. 

• There is no control or comparison group. 

• The validity and reliability of the tests used are not reported. 



20. Wisconsin Workplace Partnership Training Program 



Paris, K. A. (1992). Evaluation of the third year of the Wisconsin Workplace 
Partnership Training Program, March 1, 1991 through August 31, 1992. 
Madison, WI: Center for Education and Work, University of Wisconsin, Madison. 



The goal of the Wisconsin Workplace Partnership Training Program was 
to provide job-specific basic skills instruction to 3,066 employees at their work 
sites in order to promote job retention and/or job advancement and increased 
productivity. At the state level, the program was a cooperative effort between the 
Wisconsin Board of Vocational, Technical and Adult Education, the Wisconsin 
State AFL-CIO, and Wisconsin Manufacturers and Commerce. At the local level, 
there was a partnership between local Vocational Technical and Adult Education 
colleges, unions, and employers. The program operated in 23 work sites. 
Instruction focused on job-related reading, verbal and written communication, 
listening, math, reasoning and problem-solving, and use of English. Four research 
questions were addressed: 1. To what extent do program participants agree they 
have achieved academic and job-related objectives? 2. To what extent do local 
partners agree that participants achieve academic and job-related objectives 
through participation in the program? 3. Which program objectives do participants 
and local partners view as most significant? 4. What are some of the best practices 
exhibited by programs whose participants report the highest mean improvement in 
academic skills and job performance? 

Two surveys were designed to collect data. The first was administered to 
participants to measure the extent to which they perceived their basic skills and 
job-related skills had improved. The second was administered to local partners to 
determine their perceptions regarding how much participants’ skills had 
improved. Because a field test indicated that learners would have difficulty 
reading the instrument, the survey was verbally administered to participants by 
the researchers. The participant survey was administered at 10 randomly selected 
program sites between April and May 1992 to all learners who were present on 
days of the site visits. One hundred and two participants — 3 percent of all 
enrollees in the program — were interviewed; the response rate was 97 percent. 

For the local partners survey, during the site visits surveys with self-addressed 
envelopes were given to the union liaison to the program, the human 
resource director, the CEO or highest official at the site, three to six peer advisors 
identified by instructors, teachers, and the Vocational Technical Adult Education 
supervisor. Non-respondents were contacted by telephone to promote their 
participation and data were tabulated both by site and in the aggregate. One 
hundred and ninety-seven local partners surveys were administered and 160 were 
returned for a response rate of 81 percent. 



Fifty percent of the learners had attended between one and six months; 24 
percent had attended seven to twelve months. 

Participants’ Views 

All respondents indicated the extent to which they had improved their 
basic skills on a 5-point Likert scale on which 5 indicated “strongly agree.” 
Mean scale scores were as follows: math (4.5), writing (4.2), reading (4.4), 
speaking (4.0), ESOL (4.6), GED (4.4), and computer (4.6). Similarly, using the 
same scale, respondents were asked the extent to which they had improved 
job-related skills. Scores were: job skills in general (4.2), getting along better with 
employees (3.4), getting along better with superiors (3.9), problem-solving (4.1), 
quality (4.1), self-image (4.3), eligible for promotion (1.3), and job enjoyment 
(2.3). Of all the skills listed, both basic and job-related, only eligible for 
promotion and job enjoyment received mean scores below 3, the scale midpoint. 
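
The claim about which items fall below the scale midpoint can be checked directly against the reported means. The sketch below is illustrative only: the dictionary layout and the names `SCALE_MIDPOINT` and `below_midpoint` are assumptions introduced here, and the values are the participant means quoted above.

```python
# Participant mean ratings as reported (5 = "strongly agree").
basic_skills = {
    "math": 4.5, "writing": 4.2, "reading": 4.4, "speaking": 4.0,
    "ESOL": 4.6, "GED": 4.4, "computer": 4.6,
}
job_skills = {
    "job skills in general": 4.2,
    "getting along better with employees": 3.4,
    "getting along better with superiors": 3.9,
    "problem-solving": 4.1,
    "quality": 4.1,
    "self-image": 4.3,
    "eligible for promotion": 1.3,
    "job enjoyment": 2.3,
}

# Midpoint of the 5-point scale, as stated in the text.
SCALE_MIDPOINT = 3.0

# Merge both skill lists and keep the items rated below the midpoint.
all_items = {**basic_skills, **job_skills}
below_midpoint = sorted(
    item for item, mean in all_items.items() if mean < SCALE_MIDPOINT
)

print(below_midpoint)
```

Only the two items named in the text fall below the midpoint; every other participant mean is 3.4 or higher.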

Ninety-one percent agreed or strongly agreed that they were satisfied with 
the progress they made in the program. However, only 3 percent of the 
respondents received promotions and 10 percent were transferred to another job. 
When months of attendance were analyzed in relation to satisfaction with an 
analysis of variance, no significant relationship was found. 

Local Partners’ Views 

Local partners were asked about the extent to which they perceived 
participants’ basic and job skills had improved. The same 5-point Likert scale 
was used to record responses. For basic skills the scores were: math (4.6), writing 
(4.3), reading (4.4), speaking (3.7), ESOL (3.5), GED (4.4), and computer (4.4). 
For job skills the scores were: job skills in general (3.8), getting along better with 
employees (3.5), getting along better with superiors (3.1), problem solving (3.8), 
quality (3.3), self-image (4.4), eligible for promotion (3.0), and job enjoyment 
(3.4). Eighty-five percent of local partners believed that participants were 
satisfied with their progress in the program. 

An analysis of variance conducted on the type of local partner respondent 
showed some significant differences. With respect to total 
job-related improvement, instructors (mean=4.0) rated learner improvement 
higher than company officials (mean=3.3). With respect to perceptions that 
participants are promoted, instructors responded with a mean of 3.4 whereas 
company officials responded with a mean of 2.3. 

Analysis of variance also showed that in general participants' ratings of 
their improvement in both basic skills and job-related skills were significantly 
higher than those of local partners. Local partners perceived that participation in 
the program led to promotion to a significantly greater extent than did 
participants, although neither group perceived that the effects of participation on 
promotion were substantial. Local partners also responded with significantly 
higher scores for job enjoyment than did participants. 

Both participants and local partners were asked which program objectives 
they felt were the most important. The three highest for participants were 
computer skills, math, and self-image; for local partners the three most important 
were self-image, math, and reading. 

In general the results of the study suggest that while both learners and 
local partners were satisfied with the gains learners had achieved in basic and 
job-related skills, these gains had not been translated into job-promotion benefits for 
workers. 

Strengths 

• Explanation of methods and procedures and the presentation of findings are 
clear and complete. 

• The analysis of variance component of the study is a useful elaboration of the 
basic findings. 

• Although the sample size is relatively small, it is adequate and the response 
rate is high. 

Limitations 

• Although the 10 sites from which data were collected were randomly selected, 
the subjects for the study were not. 

• All data are based on participant and local partner self-report. 

• There are no control or comparison groups. 

21. Central Labor Council and the Consortium for Worker Education 



Gross, A. L., & Feldman, S. (1990). Evaluation report: The workplace education 
program of the Central Labor Council and Consortium for Worker Education. 
New York: Center for Advanced Study in Education: The Graduate School and 
University Center, City University of New York. 



This study evaluated the workplace education program of the New York 
City-based Central Labor Council and Consortium for Worker Education. Data were obtained 
from 15 union-based workplace literacy programs for the time period October 1, 
1989, to June 30, 1990. During that time, the program enrolled 3,775 learners and 
conducted 215 different classes. ESOL, basic education, and union-specific skills 
courses were offered. Basic education classes ranged from basic reading and 
writing to GED preparation. Union-specific classes focused on job-related 
literacy skills. Classes ranged from 2 to 49 weeks in duration and met for 4 to 
40 hours of instruction per week. Mean contact hours were: ESOL, 62.3; basic 
education, 70.3; and union-specific, 33.1. Variables measured included learners’ 
attainment of their personal goals and tested learning gain. 

Data were learner records compiled by the New York Literacy Assistance 
Center which contained pre- and posttest scores, information on contact hours, 
information on attainment of learners’ objectives, and background data. The 
researchers encountered several problems with the data set including: 1. contact 
hours were not recorded for nearly one third (1,357) of the 3,773 cases; 2. there 
were substantial data entry errors; and 3. in many cases there were multiple 
records for students. The implications of these problems are explained by the 
study’s authors in their discussion of findings on learners’ personal goal 
attainment; 

A second problem concerns the manner in which an objective was 
designated as not being achieved. For the n = 3,259 students for whom 
class type was established, only n = 995 achieved their stated objective. 
Of the n = 2,264 “non-achievers”, n = 794 were simply missing from the 
Impact File, n = 927 had a “blank” recorded, and n = 543 achieved 
objectives which did not match the originally stated objective. It is not 
clear whether all 2,264 cases represent failure to achieve an objective or 
the failure to record the data. (Gross & Feldman, 1990, p. 23) 

The tests used were the Test of Adult Basic Education (TABE) for adult 
basic education, the JOHN and Davis tests for ESL, and various job-related tests 
for union-specific education (administered in some but not all cases). Although 
the validity and reliability of the TABE have been established, its appropriateness 
for job-related basic skills education is an issue. Information on when and how 
these tests were administered is not provided. For example, the time intervals 
between pre- and posttests are not reported. 

Findings 

Learners' Attainment of Personal Goals 

When learners were asked to specify their learning goals, they were 
presented with 11 response options. For all types of classes combined, the most 
commonly stated goal (46 percent) was to prepare for a better job. Twenty-four 
percent of those who stated this goal reported that they had achieved it. The 
second most commonly stated objective was to learn English (19 percent), and 18 
percent of those who stated this objective claimed they had achieved it. 

For ESOL classes, the most commonly stated objective was to learn 
English (60 percent), and 17 percent of those who reported this objective said they 
had achieved it. To acquire citizenship was the second most important objective 
for ESOL students (31 percent), and 7 percent reported they had achieved it. 

For basic education, GED preparation was the most common objective (38 
percent), and 13 percent reported they had achieved it. To prepare for a 
better job was the second most commonly stated objective (30 
percent), and 64 percent stated that this objective had been achieved. 

For union-specific course learners, to prepare for a better job was the 
overwhelmingly most common objective (71 percent), and 22 percent reported 
they had obtained it. 

Overall, for ESOL, 152 learners achieved their goals and 841 did not. For 
basic education, 339 learners achieved their goals and 402 did not. For 
union-specific courses, 504 learners achieved their goals and 1,021 did not. The overall 
achievement of objective rate was 31 percent. The proportion of learners attaining 
the objectives that were presented to them varied greatly by program, with the 
highest scoring program achieving a 95 percent goal attainment rate and the 
lowest scoring achieving a 0 percent goal attainment rate. Although on the face of 
it findings on learner goal achievement seem to suggest rather low goal 
attainment, the data analysis problems reported earlier may well have resulted in 
an underestimation of learner goal attainment. It may also have been that the 
instructional goals of some of the union programs did not match well the goals 
contained on the response list presented to learners. 
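
The goal attainment figures reported above can be checked with simple arithmetic. The following Python sketch recomputes the attainment rate for each class type and overall from the counts given in the text. It also computes a second rate that sets aside the 794 missing and 927 blank records described by Gross and Feldman; that second figure is only an illustrative assumption about how the data problems might bias the result, not a finding of the study. All variable names are hypothetical.

```python
# Counts of learners who did / did not achieve their stated objective,
# as reported in the text (Gross & Feldman, 1990).
achieved = {"ESOL": 152, "basic education": 339, "union-specific": 504}
not_achieved = {"ESOL": 841, "basic education": 402, "union-specific": 1021}


def rate(yes, no):
    """Share of learners who achieved their objective."""
    return yes / (yes + no)


# Attainment rate for each class type.
rates = {k: rate(achieved[k], not_achieved[k]) for k in achieved}

# Overall rate across all class types.
total_achieved = sum(achieved.values())
total_not = sum(not_achieved.values())
overall = rate(total_achieved, total_not)

# Illustrative assumption (not from the study): treat the 794 missing and
# 927 blank records as unknown rather than as failures, so that only the
# remaining cases count as non-achievers.
recorded_non_achievers = total_not - 794 - 927
adjusted = rate(total_achieved, recorded_non_achievers)

for class_type, r in rates.items():
    print(f"{class_type}: {r:.1%}")
print(f"overall: {overall:.1%}")
print(f"overall, excluding missing and blank records: {adjusted:.1%}")
```

Under these assumptions the overall rate rounds to the reported 31 percent, while excluding the missing and blank records roughly doubles it. This is consistent with the caution that the recorded figures may underestimate goal attainment.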

Tested Learning Gain 

Pre- and posttests were administered in only 9 of the 15 programs. Of the 
three programs that offered ESOL, two showed statistically significant gains and one 
did not. Of the programs that offered basic education, three showed gains, one 
showed a decline and one showed no gain. Of the programs that conducted 
union-specific education, three showed gains and one did not. 

Strengths 

• Methods and procedures are clearly explained and the report is honest about 
its limitations. 

• The sample size is adequate. 

Limitations 

• Problems with data collection and data entry created inaccuracies and missing 
data in the data set. For this reason, findings on student goal attainment are 
suspect. 

• Learners were presented with a choice of 11 goals as response options for 
learner goal attainment. How relevant these response options were to the 
instruction offered is not clear. 

• The relevance of the TABE, JOHN, and Davis tests for workplace-oriented 
instruction is an issue. 

• Learners’ goals data are based on self-report. 

• The study lacks a control or comparison group. 



FAMILY LITERACY 

In family literacy, low-literate adults and their pre-school children are 
educated together to create a presumably synergistic effect on educational gains 
for both. Family literacy takes many forms. Even Start, the federally-funded 
family literacy program, requires programs to offer parent education, adult 
literacy education, and early childhood education. Support services such as 
transportation, home visits, and child care are generally offered as well. 

The Kenan Family Literacy Model program is another well-known and 
widely implemented model for family literacy. Programs that follow this model 
must provide adult literacy education and early childhood education plus two 
additional components, Parent Time and Parent Time Together (National Center 
for Family Literacy, 1992). In Parent Time, parents and their children design and 
participate in programs of interest to both, such as child nurturing and managing 
and coping with child behavior. In Parent Time Together, parents and their 
children play together to promote interaction and improved communication. 
Provision of pre-employment training and service integration are encouraged, but 
not required. Family education programs vary in the extent to which adult 
literacy education is specifically tailored to meet parenting objectives and the 
degree to which the components of the program are integrated. They also vary 
with respect to the provision of support services, such as tutoring, childcare, and 
home visits. 

22. The National Even Start Program 

St. Pierre, R., Swartz, J., Murray, S., Deck, D., & Nickel, P. (1993). National 
evaluation of the Even Start Family Literacy Program: Report on effectiveness. 
Washington, DC: U.S. Department of Education, Office of Policy and Planning. 

St. Pierre, R., Swartz, J., Gamse, B., Murray, S., Deck, D., & Nickel, P. (1995). 
National evaluation of the Even Start Family Literacy Program: Final Report. 
Washington, DC: U.S. Department of Education, Office of Policy and Planning. 



Even Start is a federally-funded family literacy program that commenced 
in 1988. In 1991-1992, 234 projects were in operation throughout the 50 states, 
and 9,690 families received some core services, 13,541 children received early 
childhood education, and 10,800 parents received adult literacy education or 
parent education. To be eligible, the law requires that a family have an adult who 
is eligible for adult literacy education under the provisions of the Adult Education 
Act, who is a parent with a child under eight years old, and who lives in a Chapter 
1 school area. Even Start programs must have at least three components: adult 
basic education, early childhood education, and parenting education. They are 
also required to provide home-based services and are encouraged to offer 
supportive services such as transportation to programs and childcare. 

The national evaluation of Even Start was mandated by the authorizing 
legislation and was contracted to Abt Associates with a subcontract to the RMC 
Research Corporation. The evaluation began in 1990 and ended in 1993. Four 
research questions were posed: 1. What are the characteristics of Even Start 
participants? Who is in the program? 2. How are Even Start projects implemented 
and what services do they provide? 3. What Even Start services are received by 
participating families? 4. What are the effects of Even Start on participants? The 
impact component of the study, addressed by question four, had two major 
components. The first, the National Evaluation Information System (NEIS), was 
a data set that contained descriptive information collected from Even Start local 
programs. The second was an in-depth study of 10 programs. The Even Start 
national evaluation measured effects on children, parents, and families. This 
report focuses exclusively on the assessment of the adult literacy education 
component. 

The NEIS 

The NEIS collected descriptive longitudinal and cross-sectional data on 
four cohorts of participants, projects first funded in 1989, 1990, 1991, and 1992. 
For the first two cohorts, data were collected annually through completion of data 
collection forms, parent interviews, testing of children and adults, and completion 
of program logs. Data were collected from families at entry to Even Start, at the 
end of each program year, and at exit from the program. Families who did not 
remain in the program long enough to be posttested were eliminated from the 
analyses. Data were collected by local program staff who were trained by the 
researchers and who were supported by a portion of the local program grant 
earmarked for this purpose. The CASAS test was administered to participants in 
adult literacy education. For the second two cohorts, a computerized data 
collection system was used that allowed local programs to submit their data on 
diskette for direct computer input. However, because many local program staff 
were not familiar with computers, many failed to use the system or used it 
incompletely. 

In-depth Study 

The in-depth study was designed to complement the NEIS. It was 
implemented with 10 projects selected from the first cohort of 73 projects on the 
basis of geographic location, level of program implementation, and willingness to 
participate in the study. By design, 40 families for each project were to be 
randomly assigned to either a participant or control group (20 in each group). 
However, only five of the ten programs were able to implement random 
assignment. Study participants were pretested with tests that included the CASAS test 
in the fall of 1991 and were posttested in summer of 1992 (9 months) and spring 
of 1993 (18 months). Tests were administered by a private contractor. 

Comparison of the control group (n = 98) with the participant group (n = 101) 
showed no significant differences with respect to subjects' background 
characteristics. However, comparison of the in-depth study participants with the 
NEIS sample showed differences, especially with respect to Spanish as the 
primary language and Hispanic ethnicity. 

Effect estimates were based on a regression model in which posttest was 
used as the dependent variable and pretest and group assignment were used as the 
independent variables. Effect magnitudes were calculated by subtracting pre-post 
test gains of the control group from those of the participant group and by then 
dividing by the standard deviation of the control group pretest scores. Adult 
literacy education outcome measures used included scores on the CASAS test, 
GED attainment, and reading/writing activities at home. 
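
The effect magnitude calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the evaluators' actual code: the function name is hypothetical, the scores are invented for the example, and the use of the sample (rather than population) standard deviation is an assumption, since the report does not say which was used.

```python
from statistics import mean, stdev


def effect_magnitude(part_pre, part_post, ctrl_pre, ctrl_post):
    """Difference in mean pre-post gains (participants minus controls),
    divided by the standard deviation of the control group's pretest scores."""
    part_gain = mean(part_post) - mean(part_pre)
    ctrl_gain = mean(ctrl_post) - mean(ctrl_pre)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return (part_gain - ctrl_gain) / stdev(ctrl_pre)


# Invented scores for illustration only; these are not Even Start data.
ctrl_pre = [220, 230, 240]
ctrl_post = [223, 233, 243]   # control group gains 3 points on average
part_pre = [221, 229, 240]
part_post = [228, 236, 247]   # participant group gains 7 points on average

effect = effect_magnitude(part_pre, part_post, ctrl_pre, ctrl_post)
print(f"effect magnitude: {effect:.2f}")
```

Dividing by the control group's pretest standard deviation expresses the difference in gains in standard deviation units, which allows comparison across measures with different score scales.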

The national evaluation of Even Start also measured the effects of the 
program on children using the PreSchool Inventory (PSI) and the Peabody Picture 
Vocabulary Test (PPVT) as outcome measures. The 32-item PSI measures school 
readiness skills such as identifying shapes and colors and numerical skills. The 
PPVT measures receptive vocabulary. In general, the findings on effects on 
children are complex. At issue for this study is the extent to which parents’ 
participation in the adult literacy education component of Even Start affected their 
children’s gains on the PSI and PPVT. This question was addressed by regressing 
children’s posttest scores on parents’ background and program participation 
variables. 

Findings 

Tested Learning Gain 

For tested learning gain the CASAS Reading Survey was used. This test 
has four levels with 24 to 40 items per level. The CASAS has been used 
extensively as a learning gain measure in adult literacy. The most recent national 
evaluation of the federal adult literacy program and the California GAIN Study 
were two studies reviewed in this report that have used the CASAS. 

In the in-depth study, valid pre- and posttests were obtained from 64 adult 
literacy education participants and 53 control group members. At the time of the 
second posttest (18 months), a statistically non-significant difference of 3.7 points 
on the CASAS was found between the gains of the participant and control groups, 
indicating that participation in Even Start adult literacy education had no 
significant effect on tested learning gain. 

However, both the participant and control groups gained significantly on 
the test. This finding was perplexing since the control group presumably 
received no instruction. Further analysis showed that nearly one quarter of the 
control group members reported they had participated in adult literacy education 
at the time of both the pre- and posttests. This and testing error may well have 
reduced the differences in gain exhibited by the two groups and accounted for the 
gains among the control group. With respect to the NEIS data, gains after 70 hours 
or more of instruction were small (4.6 points, or about one third of a standard 
deviation) but statistically significant. 

It should be remembered that the in-depth study included all participants 
who were pre- and posttested regardless of whether they remained in the adult 
literacy education program, whereas the NEIS sample included only those 
participants who were still active in the program and had received at least 70 
hours of instruction. Both the in-depth study and the NEIS data show a 
significant relationship between the amount of adult literacy education instruction 
and gains on the CASAS. NEIS data show mean gains of 3.0 points between 1 
and 69 hours of instruction, gains of 4.3 points between 70 and 200 hours of 
instruction, and gains of 5.2 points for over 200 hours of instruction. NEIS data 
also show gains were highest for those who entered with the lowest CASAS 
scores. 



GED Attainment 

Data from the in-depth study show that 22.4 percent of the participant 
group and 5.7 percent of the control group members attained a GED over the 
18-month course of the study. For the NEIS sample, of those who had at least a 
ninth-grade education at the time of initial data collection, 14.1 percent had 
obtained a GED within a year or less. Attainment of a GED was related to grade 
level at intake, CASAS score, hours of instruction, younger age, and having 
English as the primary language. 



Reading and Writing in the Home 

Reading and writing in the home was measured by 13 self-report questions 
that asked how frequently parents read various types of reading material and 
an 11-item self-report measure focusing on writing activity. Frequency was 
measured on a scale that ranged from 1 to 3. The instrument was administered 
only to in-depth study participants. At pretest, all study participants scored at 
about the middle of the measure (2 on the 3-point scale). No significant program 
effects for either reading or writing in the home were found after 18 months. 

Effects on Children’s Literacy-Related Skills 

Although the amount of time parents participated in parenting education 
was positively related to their children’s PPVT (which measures receptive 
vocabulary) scores, the amount of time parents participated in adult literacy 
education was not significantly related. Likewise, there was no relationship 
between children’s scores on the PSI (which measures school readiness) and 
parents’ participation in adult literacy education. Remarking on these findings the 
study’s authors noted: 

Parenting education activities are often targeted at developing parents’ 
abilities as teachers of their children, and children’s language development 
is exactly what is taught through many of the parenting activities 
emphasized in Even Start, such as reading to children. On the other hand, 
there is no special reason that participation in adult education programs, 
which focus only on the development of adult-level skills, should yield 
immediate benefits to children. (St. Pierre et al., 1995, p. 180) 

Strengths 

• The reports provide an adequate description of methods and procedures and 
the findings are clearly presented. 

• The in-depth study includes random assignment to control and participant 
groups. 

• The NEIS data were systematically collected and are comprehensive in the 
variables studied. The sample size for the NEIS is large and adequate. 

• The study is longitudinal, measuring program effects at 9- and 18-month 
intervals. 

Limitations 

• The sample size for the in-depth study was small. 

• Random assignment to treatment and control groups was implemented in only 
five programs for the in-depth study. Thus differences in program 
characteristics and contexts could have influenced the findings. 

• As the report acknowledges, the CASAS test may not have been sufficiently 
sensitive to measure literacy gain. 

23. The National Center for Family Literacy 



Hayes, A. E. (1997). The power of Even Start family literacy: A summary of 
findings from a follow-up study. Louisville, KY: The National Center for Family 
Literacy. 



The purpose of this study was to determine the long-term effects for a 
group of Even Start programs that met the National Center for Family Literacy’s 
quality standards. The rationale for the study was that, because programs 
evaluated in the National Even Start Evaluation did not necessarily meet the high 
quality standards established by the National Center for Family Literacy, the 
effect of high quality standards was “obscured.” Program sites were selected 
through nomination by “leaders in the Even Start community” and by external 
evaluators familiar with Even Start programs. Data were collected from 15 
programs that served a total of 507 children and 508 adults. Data were 
supplemented by the evaluation reports of several other family literacy programs. 

Data were collected in January and February 1997, one to six years after 
families had terminated the program, by local site coordinators who used forms 
and procedures developed by the National Center for Family Literacy. Most of 
the study focused on effects on children; only the impacts of adult literacy 
education are reported here. 

Findings 

Sixty-two percent of those for whom attainment of high school 
certification was an appropriate goal attained high school certification. Fifty 
percent either obtained a job or obtained a better job. Forty percent enrolled in 
higher education. Forty-two percent of the 260 former participants who received 
welfare when they enrolled in the program reduced the amount of public 
assistance they received. 

Strengths 

• The sample size is adequate. 

Limitations 

• The report lacks important information regarding research design and 
procedures. Most notably, information on sample selection is omitted. 

• The programs from which research subjects were drawn were selected because 
they reportedly met high standards. Thus the sample is not representative. 

• The study lacks a comparison or control group. 

• Presumably, most findings are based on self-report. 

CHAPTER 4: CONCLUSIONS 

In accordance with the goals of this study, this chapter addresses four 
questions: 1. Based on the case analyses of the outcome and impact studies 
reviewed, how effective is the adult literacy education program in the United 
States? 2. What are the common conceptual, design, and methodological problems 
in the studies analyzed? 3. What are the implications of the analysis for adult 
literacy education policy? 4. What recommendations are warranted? 

HOW EFFECTIVE IS THE ADULT LITERACY 
EDUCATION PROGRAM? 

The 23 studies reviewed in depth for this report were chosen because, with 
respect to their design, methodologies, and reporting, they were the most credible 
of the approximately 115 studies initially reviewed. However, as those who have 
read the case studies can attest, these 23 studies vary considerably in their 
strengths and limitations and none is definitive to the extent that it alone can put 
the question of impact to rest. 

These studies represent evidence rather than proof of impact, and, like 
evidence in a trial, their findings must be weighed in order to reach reasonable 
conclusions. Weighing has two dimensions. The first is the extent to which the 
various studies converge or diverge with respect to their findings on specific 
outcome and impact variables. Consensus among studies points towards 
effectiveness or ineffectiveness, while lack of consensus suggests an inconclusive 
resolution. The second dimension is the credibility of the individual studies. 
Obviously, when arriving at conclusions, more credible studies must be weighed 
more heavily than less credible studies. 

The conclusions set forth here are deemed to be reasonable inferences 
from the findings reported in the case studies. They do not represent proof. 
Indeed, it is unlikely that any conceivable study or studies could arrive at certainty. 
Table 1 presents the data used for this analysis. 

Table 1: Results of the Case Studies 

Along the top of the table are the outcome and impact variables addressed by a sufficiently large 
number of studies to permit inferences: 

empl = gains in employment 
bjob = acquisition of a better job 
inc = increased income 
ced = continued education 
wel = termination of or reduction in public assistance 
read = self-reported gains in reading 
write = self-reported gains in writing 
math = self-reported gains in math 
TLG = tested learning gain 
GED = GED acquisition 
slf-c = self-concept, self-esteem, or self-confidence 
child = impact on children's education 
pgoal = attainment of learner's personal goals 

Studies are listed in the order in which they were reviewed in Chapter 3. 

Study codes (STUDY): N=national study, S=state, W=welfare, WO=workplace, FL=family 
literacy. 

Methods (METH): P = post measurement only, PP = pre-post, L = longitudinal measurement, 
C = control or comparison group. 

y = study found impact on this variable, ? = inconclusive findings, n = study found no or negative 
impact, blank cells = variable not studied. 

In interpreting Table 1 and the conclusions made from it, three caveats are 
in order. First, the variables included are those studied by a sufficiently large 
number of studies to enable reasonable conclusions. However, variable 
definitions and their units of measure vary among studies. In some studies, for 
example, learning gain is measured by the CASAS, while in others the TALS or 
TABE is used. Second, if a given study reported a gain, the gain is listed as 
positive (y) in the table irrespective of the size of the gain or the quality of the 
study’s methodology. In some cases the gains reported as positive are quite small, 
and in some cases the limitations of the study render claims of gains suspect. 
Third, the totals are aggregates of studies conducted at different times and on 
different populations of adult literacy learners, welfare clients and employees 
being examples. Drawing conclusions from such aggregates presumes that doing 
so is both valid and meaningful. 
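
The first dimension of weighing, convergence among studies, amounts to a simple tally. The Python sketch below counts positive (y), inconclusive (?), and negative (n) findings for several variables, using the study counts stated in the sections of this chapter, and reports the share of studies that found an impact. It is only an illustration: the tally is unweighted, so it ignores study credibility and the size of the gains, which is exactly what the caveats above warn against. The dictionary and function names are hypothetical.

```python
# (positive, inconclusive, negative) study counts per variable,
# taken from the counts stated in the text of this chapter.
tallies = {
    "empl": (11, 2, 1),  # employment
    "bjob": (4, 0, 1),   # better job
    "inc": (5, 1, 0),    # increased earnings
    "wel": (6, 0, 2),    # welfare reduction
    "TLG": (5, 1, 2),    # tested learning gain
    "GED": (4, 2, 0),    # GED acquisition
}


def share_positive(counts):
    """Unweighted share of studies reporting a positive finding."""
    positive, inconclusive, negative = counts
    return positive / (positive + inconclusive + negative)


shares = {name: share_positive(counts) for name, counts in tallies.items()}

for name, share in shares.items():
    print(f"{name}: {share:.0%} of studies found an impact")
```

A high share signals convergence but not credibility. The conclusions in this chapter therefore give extra weight to the few studies with comparison groups, and sometimes depart from what the raw tally alone would suggest.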



Employment 

Eleven studies found gains in employment, two were inconclusive, and 
one study found no gain. Most of the studies reporting employment gains lacked 
comparison or control groups and most were based on self-report data. Lack of a 
control group is a serious limitation, since employment is highly susceptible to 
fluctuations in the economy, and, without a control group, it is difficult to infer 
whether gains were caused by participation in adult literacy or by economic 
factors. Furthermore, in some cases findings may have been influenced by the 
bias inherent in self-report. 

Two studies, the Washington Workforce Training Study (5S) and the 
Texas JOBS Program Evaluation (15W), did use matched comparison groups, 
however. Both used hard economic data on employment rather than self-report 
and both were of high technical quality. For former adult literacy education 
participants who enrolled in order to gain employment, the Washington study 
found negative gains in employment in contrast with the comparison group. In 
the Texas JOBS Program Evaluation, for which the subjects were JOBS 
participants and AFDC recipients, employment benefits for those who 
participated in adult literacy education were found. 

Conclusion: In general, it is likely that participants in adult literacy education 
experience gains in employment. 

Better Jobs for Those Who Are Employed 

Of the five studies that measured job improvement, four reported that 
participation in adult literacy education helped participants gain better jobs and 
one study reported no gains in job improvement. Although one of these studies 
conducted in Ohio (12S) included a comparison group, in all cases the data were 
based on self-report. In a national study (3N), substantial increases in pay were 
found for those who were employed at the time of initial data collection, although 
more than half the respondents believed participation had been of no help in the 
acquisition of their increased remuneration. The one study that found no job 
improvement (20WO) reported that neither participants in workplace education 
nor local partners (employers, teachers, union representatives) believed that 
participation had resulted in promotion. 

Conclusion: In general, participants in adult literacy education believe their jobs 
improve over time. However, there is insufficient evidence to conclude that 
participation in adult literacy education causes job improvement. 

Increased Earnings 

Of the six studies that measured impact on earnings, five found gains and 
one was inconclusive primarily because of its methodological limitations. Two of 
the studies reporting gains, the Washington Workforce Study (5S) and the 
Texas JOBS evaluation (15W), included matched comparison groups, used hard 
economic data to measure earnings, and were technically sophisticated. However, 
the Washington study included only subjects who participated in adult literacy to 
gain employment, and the gains in earnings found were very small. The Texas 
study was restricted to JOBS participants. 

Conclusion: In general, it is likely that participation in adult literacy education 
results in earnings gain. 



Continued Education 

All of the 10 studies that measured continued education found that 
participation in adult literacy had a positive impact on further education, although 
one study, the 1973 National Evaluation (3N), was somewhat ambiguous. It 
found that, while during initial data collection 60 percent of the respondents 
reported they might attend college and 70 percent said they might attend 
vocational school, 18 months later the percentage for college attendance had 
declined to 37 and the percentage for vocational school had declined to 65. Five 
of the nine studies asked learners whether they planned to enroll in future 
education while three, which used a pre-post design, asked respondents whether 
they actually had enrolled. None of the studies included a control or comparison 
group. 

Conclusion: In general, adult literacy education has a positive influence on 
participants' continued education. 

Termination/Reduction of Welfare Dependence 

Six studies show a reduction of dependence on welfare for adult literacy 
education participants, while two do not. Four studies included control or 
comparison groups. Of these, two showed no change in welfare status and three 
reported welfare reduction. Of the four studies with comparison groups, two were 
of high technical quality. Of these two, one, the Washington Workforce Study 
(5S), reported no reduction. The other, the Texas JOBS Evaluation (15W), did 
find welfare reduction over time. For one of the studies, the 1973 National 
Evaluation (3N), the reduction in welfare was very small. 

Conclusion: Although the evidence suggests that participants in 
welfare-sponsored (JOBS Program) adult literacy education do experience a reduction in 
welfare dependence, the evidence is inconclusive as to whether adult literacy 
education in general reduces welfare dependence for participants. 

Self-Reported Gains in Basic Skills 

The overwhelming majority of studies that asked participants whether they 
had improved in reading (9 studies), writing (8 studies), and math (9 studies) 
found that learners perceived gains in these subjects. The exception is one study 
that was rated inconclusive primarily for methodological reasons. 

Conclusion: Learners perceive that participation in adult literacy education 
improves their skills in reading, writing, and mathematics. 

Tested Basic Skills Learning Gain 

Using tests to measure basic skills gain presents many thorny issues 
including the appropriateness of tests used, test validity and reliability, attrition of 
subjects between pre- and posttesting, ceiling and floor effects, and the amount of 
instruction received between test administrations. These issues are discussed in 
more detail in the case studies and later in this chapter. 

Another issue that plagues inferences on tested learning gain is the lack of 
a standard in adult literacy education regarding what should be considered an 
acceptable gain over a given duration of instruction. Without such standards, it is 
difficult to infer whether reported gains represent poor, good, or outstanding 
performance. 

Five studies found tested basic skills gains, two did not, and one study was 
ambiguous in its findings. For ABE learners, the National Evaluation of Adult 
Education Programs (NEAEP) (1N) found small gains on the TABE test after an 
average of 84 hours of instruction and small gains on the CASAS for ESOL 
students after 120 hours of instruction. However, as the case analysis notes, there 
were substantial methodological problems associated with these results. The 
NEAEP did not include a control or comparison group. The 1973 national 
evaluation (3N) also showed small gains on the TABE, but almost one fifth of 
those tested had had 39 or fewer hours of instruction. Like the NEAEP, the 1973 
national evaluation did not include a control or comparison group. 

Three studies did include control or comparison groups, and two of these, 
the California GAIN Study (14W) and the National Evaluation of the Even Start 
Program (22FL), were technically sophisticated. Using the TALS test, the 
California GAIN Study found no gain on the document and quantitative portions 
of the test after an average of 251 hours of instruction, although substantial gains 
were noted for one of the six counties studied in the evaluation. All subjects were 
welfare recipients and mandated participants in the California GAIN program. 

For a large sample that received pre- and posttesting on the CASAS test, the 
National Even Start Evaluation found small gains after about 70 hours of 
instruction. However, for a smaller sample that included a control group, no 
significant gains were found. 

Conclusion: As measured by tests, the evidence is insufficient to determine 
whether or not participants in adult literacy education gain in basic skills. 

GED Acquisition 

Four studies found impacts for GED acquisition and no study found lack 
of impact. Two studies were rated inconclusive because the manner in which they 
measured GED acquisition made it difficult to infer the nature of impact. One 
inconclusive study, an Ohio study (12S), reported that of the 62 percent of the 
respondents who entered ABE with the goal of obtaining a GED, 40 percent said 
they attained that goal. The other inconclusive study, a Wisconsin Study (13S), 
found that while 44 percent of the former participants indicated they had desired 
to obtain a GED, 27 percent had actually passed the GED tests. The California 
GAIN Study (14W) and the National Even Start Evaluation (22FL), both of which 
included comparison groups, found significant impact on GED acquisition. 

Conclusion: In general, adult literacy education provides gains in GED 
acquisition for participants entering at the adult secondary education (ASE) level. 

Self-image 

In the studies listed, self-image was variously defined as self-concept, 
self-esteem, and self-image. Of the 10 studies that included this variable, eight 
showed gains, which were usually quite large; two were rated inconclusive, 
primarily because of the way they reported their findings; and no study found 
declines in self-image. With one exception, measures of self-image were based 
on self-report. That exception was the Tennessee Longitudinal Study (6S), which 
found small, but statistically significant, gains on the Rosenberg Self-Esteem 
Scale between initial data collection and follow-up. Yet despite these small 
measured gains, when the Tennessee subjects were asked whether their feelings 
about themselves had changed, 70 percent responded affirmatively. 

Conclusion: Participation in adult literacy education has a positive impact on 
learners' self-image. 



Impact on Children’s Education 

The studies reviewed employed several variables to measure impact on 
children’s education, including the extent to which participants helped their 
children with homework, attendance at PTA meetings, and parent-teacher 
interaction. Some studies used one variable; others used multiple variables. The 
assessment of impact, therefore, represents a summation of the findings for each 
study listed. 

Eight studies found that participation in adult literacy had positive effects 
on children’s education, two were inconclusive, and one was negative. Of the two 
that were inconclusive, one was so primarily for methodological reasons and the 
other, an Ohio (12S) study that included a comparison group, demonstrated mixed 
results. Although former participants scored higher than the comparison group on 
attending parents’ meetings, the comparison group scored higher on 
communication with the schools. Differences between the groups on helping 
children with homework were not statistically significant. All studies were based 
on respondent self-report except the one study — the National Even Start 
Evaluation — that showed negative results. 

The National Even Start Evaluation found no significant effects with 
respect to reading and writing in the home after 18 months for families where 
parents participated in adult literacy education. Neither did the study find a 
relationship between children’s scores on the Peabody Picture Vocabulary Test 
(PPVT) and parent’s participation in adult literacy education. 

Conclusion: According to learners’ self-reports, participation in adult literacy 
education has a positive impact on parents’ involvement in their children’s 
education. 



Attainment of Personal Goals 

Measuring attainment of learners’ personal goals raises two interesting and 
important conceptual issues. First, if participation in adult literacy education 
should result in new self-understanding and increased aspirations that lead 
learners to change their goals during the course of instruction, measuring the 
extent to which initial goals were achieved in a post-assessment format is 
irrelevant. 

Second, it is very difficult to measure personal goal attainment with an 
experimental or matched comparison group design because non-participants who 
constitute control or matched comparison groups do not have goals for adult 
literacy education. Consequently, every study that measured personal goal 
attainment either collected its data from program participants or compared the 
goal attainment of participants to that of former learners who had exited the 
program. Former students included dropouts as well as successful completers. 

Yet participation in adult education is generally voluntary. Therefore, 
positive reports of goal attainment may be artificially inflated by the tendency of 
those whose goals were not being met to drop out prior to data collection. 
Similarly, and for the same reason, participants may be artificially more likely to 
report goal attainment than former learners, if former learner groups include those 
who dropped out for lack of goal attainment. 
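
The direction of this bias can be shown with simple arithmetic. The sketch below is illustrative only: the function name and every figure in it are hypothetical and are not drawn from any study reviewed here. It assumes a group in which half of the learners are meeting their goals, and in which a share of those not meeting their goals leave before data collection.

```python
def observed_attainment_rate(met, unmet, dropout_rate_unmet, dropout_rate_met=0.0):
    """Share reporting goal attainment among learners still enrolled at data collection.

    met / unmet: hypothetical counts of enrollees whose goals are or are not being met.
    dropout_rate_*: assumed share of each group that leaves before data are collected.
    """
    met_remaining = met * (1 - dropout_rate_met)
    unmet_remaining = unmet * (1 - dropout_rate_unmet)
    return met_remaining / (met_remaining + unmet_remaining)


# Hypothetical cohort: 50 learners meeting their goals, 50 not meeting them.
true_rate = 50 / (50 + 50)

# Assume 60 percent of those whose goals are unmet drop out before data collection.
observed_rate = observed_attainment_rate(met=50, unmet=50, dropout_rate_unmet=0.6)

print(f"true rate: {true_rate:.2f}, observed rate: {observed_rate:.2f}")
```

Under these assumed figures, the rate observed among remaining participants is about 71 percent even though only 50 percent of those who enrolled were meeting their goals.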

Of the seven studies that measured learners’ personal goal attainment, six 
found positive impact and one found little impact. The one study that found little 
impact (21WO) measured goal attainment as the difference between what subjects 
said their goals were at program entry and the goals they reported they achieved at 
follow-up. Thus missing data and changes in goals were recorded as non- 
attainment. 

The 1980 National Evaluation (2N) asked respondents to indicate whether 
the goals they had set for themselves upon their entry into the program had been 
met. Only 17 percent reported their goals had not been at least partially met. 

Conclusion: Learners perceive that their personal goals are achieved through 
participation in adult literacy education. 

CONCEPTUAL, DESIGN, AND METHODOLOGICAL PROBLEMS 

With few exceptions, the studies analyzed for this report were flawed in 
ways that severely compromised the validity and utility of their findings. At best, 
public funds have been wasted. At worst, important planning and policy 
decisions have been made on inaccurate and misleading data. There are at least 
six causes of the flaws that are inherent in the studies reviewed for this report: 
inaccurate or incomplete data; over-reliance on self-report data; lack of adequate 
controls; lack of valid, reliable, and appropriate tests; poor quality research 
reports; and lack of relevant standards. Clearly, these problems must be avoided 
in future outcome and impact research if useful knowledge is to result. 

Inaccurate and Incomplete Data 

An overwhelming majority of the studies, including all those conducted at 
the national and state levels, collected learner data through adult literacy 
education programs. Some, most notably the NEAEP (1N) and the Tennessee 
Longitudinal Study (6S), were open and honest about the data collection problems 
they experienced. Others were either less candid or simply assumed their data 
were accurate. Common problems with data collected through programs were: 
inaccurate learner records, failure to pre- and posttest at specified intervals, 
administration of inappropriate levels of tests, failure to test, high attrition of 
subjects between pre and post data collection, programs’ withdrawal from the 
study before data collection was complete, and failure to forward data to 
researchers in a timely fashion. These problems were most severe for large-scale 
outcome assessments like the NEAEP that required complete data in order to 
maintain their protocols for ensuring generalizable findings. 

In designing data collection protocols, researchers must realize that the 
programmatic context of adult literacy education differs substantially from the 
more traditional evaluation venues of elementary, secondary, and higher 
education. Because of open enrollments, it is difficult to pretest learners before 
they have engaged in instruction, and because of sporadic attendance, fear that 
testing may alienate learners, and high dropout rates, it is difficult to 
systematically posttest adequate proportions of those pretested. Many programs 
lack staff experienced in testing and data collection, and, as a result, inappropriate 
levels of tests are administered and data collection protocols are violated. 

Post-data collection at realistic intervals is a major problem. If actual 
hours of instruction are employed as the interval criterion, an accurate record of 
instructional hours received is necessary, and research subjects must be available 
when posttests are due. Sporadic attendance and dropout render this precision 
problematic. If larger units of elapsed time, such as weeks, months, or years, are 
used as the criterion, there is the problem that the actual hours of instruction 
learners receive per week or month can vary significantly. Except for GED 
completion, there typically is no concrete marker for successful completion of 
adult literacy instruction. Thus it is difficult to assess whether learners who have 
exited programs are successful completers or program dropouts. Moreover, 
because adult literacy learners tend to be a geographically mobile population, 
when collecting data it is extremely difficult to locate those who have terminated 
programs. 



Self-report Data 

Most studies relied on self-report rather than objective data for their 
findings. Lacking pre-data, studies that used a post-only design had little choice. 
For all studies, collecting self-report data was the obvious option given that hard 
data on such variables as employment, earnings, welfare reduction, and continued 
education were simply not available. The exceptions were the welfare and 
workforce studies that were able to obtain objective data from state welfare and 
employment records. 

The validity of self-report data is a matter of concern for researchers. 
When at post-data collection learners are asked to remember feelings or events 
they experienced at program entry, the accuracy of recall is an issue, especially 
after periods of long duration. More important, there is a normal tendency for 
respondents to inflate the value of experiences, like adult literacy education, that 
have high social value and which entail significant sacrifices on the respondents’ 
part. By and large, self-reported gains were higher for the studies reviewed here 
than were gains measured by objective data. For example, when learners were 
asked to report their gains in basic skills, positive attribution of gain far 
outweighed negative attribution. However, when gains in basic skills were 
measured by tests, the gains found were small or, as with two studies, non- 
existent. Similarly, when the Tennessee Longitudinal Study (6S) measured gains 
in self-esteem with the Rosenberg Self-Esteem Scale, the gains were very small. 
Yet the self-reported gains were very high. Of course, it is quite possible that the 
tests used were not entirely appropriate or were not sensitive enough to measure 
the gains learners recognized in themselves. 

Self-reported gains in employment tended to be moderate. For the 1973 
National Evaluation (3N) they were about 10 percent after 18 months and for the 
New Jersey Study (8S) the overall gain in employment was 13 percent. However, 
when gains in employment were measured using actual employment records, they 
were small in the case of the Texas JOBS Study (15W) and non-existent in the 
case of the Washington Workforce Study (5S). 

Lack of Controls 

Many of the variables used in impact studies are influenced significantly 
by context and time. The state of the economy affects employment. Over time, 
earnings normally increase as job tenure increases, and dependence on AFDC 
declines as welfare recipients’ children grow older. How, then, is it possible to 
infer that gains are caused by adult literacy education rather than by unknown 
contextual or time-related factors? 

The most logically defensible way is through the use of an experimental 
design in which subjects are randomly assigned either to instruction or a control 
group. Because random assignment ensures that participant and control groups are 
equivalent in all respects except participation, differences in group pre- and post- 
gains can be attributed solely to the impact of adult literacy education. 
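
The comparison described above can be sketched as a difference in average gains. The following is a minimal illustration, not the procedure of any study reviewed here; the function names and the weekly earnings figures are hypothetical.

```python
from statistics import mean


def mean_gain(pre, post):
    """Average post-minus-pre change for one group of paired measurements."""
    return mean([after - before for before, after in zip(pre, post)])


def net_impact(part_pre, part_post, ctrl_pre, ctrl_post):
    """Participant gain minus control gain.

    The control group's gain stands in for what would have happened anyway
    (economic conditions, job tenure, the passage of time), so the difference
    is the portion of the gain attributable to participation.
    """
    return mean_gain(part_pre, part_post) - mean_gain(ctrl_pre, ctrl_post)


# Hypothetical weekly earnings before and after the study period.
participants_pre = [200, 220, 210, 190]
participants_post = [260, 270, 250, 240]
controls_pre = [205, 215, 195, 205]
controls_post = [235, 245, 225, 235]

raw_gain = mean_gain(participants_pre, participants_post)
impact = net_impact(participants_pre, participants_post, controls_pre, controls_post)

print(f"participant gain: {raw_gain}, net of control gain: {impact}")
```

In this invented example the participants gain 50 on average, but the controls gain 30 without instruction, so only 20 of the participants' gain would be attributed to adult literacy education. A study without a control group would report the full 50.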

Although an experimental design is ideal when inferring causality is a 
goal, of all the studies reviewed, only the California GAIN Study (14W) and the 
National Even Start Evaluation (22FL) were able to employ one. As a result, the 
extent to which adult literacy education causes impact is a very difficult question 
to answer. In the real world, experimental designs are rare because they are 
extremely difficult and expensive to employ. Random assignment is frequently 
impossible, especially on a large scale, and it is often difficult to collect data from 
control group members unless they are substantially paid. Moreover, extensive 
attrition from participant groups biases results, and attrition is a common 
occurrence in adult literacy education. Experimental designs have their 
limitations. Although they do permit researchers to infer causality, they generally 
do not explain how participation produces the positive effects found. 

If a true control group cannot be employed, use of a comparison group is a 
second-best option. However, when comparison groups are used it is vital that the 
comparison be appropriate. Two studies (5S, 15W) used matched comparison 
groups in which, through complex procedures, a participant group was compared 
to a non-participant group that was similar with respect to potent background and 
demographic variables. Although matched comparison groups provide a useful 
unit of comparison, they are not equivalent to participant groups in all respects. 
Indeed, by virtue of volunteering to participate in adult literacy, an act that is 
protracted and arduous, participants demonstrate a certain level of motivation that 
does not necessarily apply to comparison group members. Was it participation in 
adult literacy that caused the social and economic impacts noted in Washington 
(5S) and Texas (15W) or was it high personal motivation? The answer is not 
known. 



Two studies used comparison groups that were inappropriate. One (10S) 
used former participants for the comparison, but, because it could not be 
determined to what extent this group was composed of successful completers as 
opposed to dropouts, the comparison was flawed. Another study (4N) employed 
program dropouts as a comparison. However, this comparison too was 
inappropriate because the participant and comparison groups may well have 
differed with respect to their motivation and/or ability to participate. 

Valid and Reliable Tests 

The most commonly used tests for basic skills gain were the TABE and 
the CASAS tests. Use of these tests, or any other for that matter, raises important 
issues. The first is the appropriateness of the test for the instruction given. The 
TABE, for example, is a standardized, general measure of basic reading skills. 
Although this test may be appropriate for programs that focus on basic skills 
development, it may be inappropriate for programs that focus on contextualized 
literacy, such as workplace literacy where the content of instruction is presumably 
job-specific. Nevertheless, one of the few workplace literacy studies that 
systematically measured tested learning gain (21WO) used the TABE. 

The contradiction that learners generally report large gains in basic skills 
while tests show much smaller or no gains raises the possibility that the tests 
available are not sensitive enough to measure important gains perceived by 
learners. Although the validity and reliability of tests such as the CASAS, TABE, 
and TALS have been established, for expediency, many studies used only portions 
of tests, and for portions of tests, validity and reliability are serious issues. 

Many tests, the TABE for example, are calibrated into levels. If an 
improper level of the test is administered, erroneous and misleading results can 
occur. For example, if a level of the test that is too difficult for the population 
tested is given, the “bottomed-out” scores that result can only go up. Thus the 
chance factor, which is inherent in all tests, works in favor of increased scores at 
posttest, and artificial gains result. If the test level is too easy, artificial declines 
will result. These “ceiling and floor” effects may have distorted findings for 
several of the studies reviewed (1N, 3N, 14W, 22FL). 
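
The mechanism can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of the TABE or of any test used in the studies reviewed: the score range, the size of the chance error, and the learners' true scores are all hypothetical, and the learners' true scores are held constant so that any measured change is an artifact.

```python
import random

FLOOR, CEILING = 0, 40  # hypothetical raw-score range for one level of a test


def observed(true_score, rng, noise_sd=4.0):
    """One test administration: true score plus chance error, limited to the score range."""
    raw = round(true_score + rng.gauss(0, noise_sd))
    return max(FLOOR, min(CEILING, raw))


def simulate(true_score, n=1000, seed=0):
    """Pre/post score pairs for n learners whose true score does not change."""
    rng = random.Random(seed)
    return [(observed(true_score, rng), observed(true_score, rng)) for _ in range(n)]


def mean_change(pairs, pretest_value):
    """Average post-minus-pre change among learners with the given pretest score."""
    changes = [post - pre for pre, post in pairs if pre == pretest_value]
    return sum(changes) / len(changes)


too_hard = simulate(true_score=2)   # level too difficult: many pretests sit at the floor
too_easy = simulate(true_score=38)  # level too easy: many pretests sit at the ceiling

floor_gain = mean_change(too_hard, FLOOR)      # artificial gain
ceiling_loss = mean_change(too_easy, CEILING)  # artificial decline

print(f"change for floor scorers: {floor_gain:+.2f}")
print(f"change for ceiling scorers: {ceiling_loss:+.2f}")
```

Learners who bottom out at pretest show an average gain, and learners who top out show an average decline, although nothing was learned or lost in either case.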

Test administration affects test validity. For timed tests, time 
specifications must be followed. Tests must be administered at appropriate time 
intervals. If the interval is too short for reasonable gains to be expected, 
artificially low gains result, and if the interval is too long, learner attrition often 
takes an unacceptably high toll. 

Poor Quality Reports 

Reports of outcome and impact research must contain certain information 
about design and methodology so that researchers and policy makers can assess 
the credibility of the research. This critical information includes basic research 
design, sampling procedures, data collection procedures, response rates, test 
validity and reliability, and time intervals between pre- and posttesting. Of the 
115 studies reviewed for this report, about 40 percent lacked this vital information 
and had to be eliminated from further consideration. Since the credibility of their 
findings could not be assessed, they were essentially worthless. 

Lack of Standards 

There are two equally important processes involved in credible outcome 
and impact research. The first requires competent design, execution, and 
reporting so that valid findings result, while the second entails judgment regarding 
whether findings represent program success. There are at least two bases for 
judgment and both are problematic in adult literacy education. The first is 
comparative judgment in which the gains documented by one study are compared 
to the gains of similar studies. However, because the outcome and impact studies 
of adult literacy education vary greatly in design, procedures, and populations, 
meaningful comparative judgment is confounded. The second is normative 
judgment. In this case findings are assessed in relation to established standards. 
However, for adult literacy education, such standards do not yet exist. 

This problem is particularly evident in tested learning gain. The problems 
and issues of testing notwithstanding, how much tested gain should be expected 
after a given period of instruction? Certainly any gain is not sufficient, for very 
small gains can hardly be indicative of program success. Because the 23 “most 
credible” studies analyzed here do not yield a comparative standard, and because 
there are no normative standards for tested learning gain, we simply do not know. 

IMPLICATIONS FOR POLICY 

Finding the Answer 

The essential task for any outcome and impact study is to assess 
programmatic effectiveness. Yet as the case studies attest, flaws in study design 
and execution have severely impeded the ability of most studies to make such an 
assessment. None of the three national evaluations of the Adult Education 
Act-sponsored program were able to do so convincingly. To our knowledge, after 
more than 25 years, only nine credible state studies have been conducted, and all 
are limited in one way or another. 

Although this study has attempted to make reasoned conclusions about 
program effectiveness from the evidence embodied in the case studies taken 
together, the data are not sufficient to settle the question of effectiveness at a level 
of certitude the public deserves. Furthermore, a significant amount of public 
funding has been wasted on individual studies that were unable to yield valid 
conclusions. More and better outcome and impact research is needed. 

Economic Impact 

Increasingly, adult literacy education is being held accountable for its 
economic outcomes and impacts, and the news from this analysis is basically 
good. The evidence presented here suggests that, at least in the short-term, adult 
literacy education does produce employment-related benefits, although the extent 
to which the jobs acquired are good jobs is still a question. Likewise, the 
evidence suggests that adult literacy education has a positive short-term impact on 
earnings. Short-term gains are but the tip of the iceberg, however, and long-term 
gains need to be investigated. 

It is possible, for example, that there are intergenerational effects to 
participation, as would be the case if increased earnings were invested in the 
health and education of children. If, 20 to 30 years after completion of adult 
literacy education, participants’ children were shown to have higher levels of 
education than the children of non-participants, an important social benefit would 
have been demonstrated. Similarly, the long-term cumulative effects of adult 
literacy need to be studied. It may well be that the power of adult literacy 
education lies not in its function as an end that produces immediate gains but in 
its function as an enabling means to a wide range of other benefits that, when 
obtained, yield still more benefits. A hypothetical case in point would be 
successful learners who go on to further education, subsequently obtain high-level 
employment, and end up increasing their incomes substantially. Such cumulative 
gains would not even begin to accrue until five or more years after completion of 
adult literacy education. 



Basic Skills 

The findings of this study suggest that while learners generally perceive 
they have gained in basic skills as a consequence of participation, the evidence 
from testing is insufficient to infer gain. The implications for policy depend 
somewhat on one’s perspective. According to one perspective, if learners 
perceive they gain in basic skills, achieve economic benefits, and meet their 
personal goals, the adult literacy education program is functioning effectively and 
tested learning gain need not be a major point of concern. Our findings support 
program effectiveness according to this perspective. An alternative perspective, 
however, is that adult literacy is first and foremost an educational program that 
focuses on reading, writing, and computation. It should, therefore, be held 
accountable primarily for how much its clients learn in these areas, and, since they 
are an objective measure, tests are an important and valid measure of impact. 
According to this perspective, there is substantial room for improvement with 
respect to basic skills. 

Clearly, what is meant by basic skills is an important policy consideration. 
If literacy is defined as a set of generalized reading, writing, and computational skills 
akin to what is taught in elementary education, then standardized test results are a 
relevant criterion. If, however, literacy is defined as the foundation abilities 
learners need in order to function in specific contexts — abilities that are context- 
specific and more wide-ranging than “academic” reading, writing, and 
mathematics — then standardized tests are anathema and learners’ perceptions of 
their basic skills gains may be the most acceptable criterion for measurement. 

Self-image 

Of all the evidence presented in this study, the evidence that adult literacy 
education produces gains in positive self-image (and similar constructs such as 
self-confidence and self-esteem) is the strongest. From a policy perspective, this 
is somewhat perplexing, since feeling better about one’s self is not a stated goal of 
the federal adult literacy program. For this reason, perhaps, the question is not 
whether learners’ self-image improves with participation, but whether improved 
self-image is a lasting effect that promotes impact in the human capital domain 
and reduces social dependence. On the one hand, it could be that reported gains 
in self-image are no more than the short-term elation people normally experience 
when they have completed difficult and protracted tasks. On the other, the effects 
might be lasting. Although lasting gain has the potential for stimulating 
successful learners’ motivation to succeed in beneficial activities they otherwise 
might not have attempted, we do not know whether gains in self-image are 
lasting, and the relationship between enhanced self-image and the attainment of 
other benefits needs to be established. 

Children’s Education 

In the studies reviewed, impacts on children’s education were generally 
measured by such variables as the extent to which participants read to their 
children, whether they attended PTA meetings, and the frequency with which they 
interacted with children’s teachers. Based on respondents’ self-reports and these 
measures, it was concluded that adult literacy education has a positive impact on 
children’s education. These variables, however, are essentially surrogates for the 
long-term effects on children that would be expected if genuine impact were to be 
demonstrated, effects that might include children’s more positive attitudes toward 
education, improved school performance, higher secondary school graduation 
rates, and increased enrollment in higher education. These long-term effects need 
to be established. 

Personal Goal Attainment 

As Beder (1991) notes, for many adult literacy education practitioners, that 
learners obtain their personal goals is the most important objective of instruction. 
There are practical as well as philosophical reasons for this. If voluntary learners 
cannot attain their personal goals, there is no reason for them to attend. On the 
other hand, policy makers have increasingly stressed the human capital outcomes 
of adult literacy education, outcomes such as increased employment and income. 
Those who focus on learner goal attainment note that many learners are not 
motivated by attainment of jobs and income, and there is evidence to support this 
assertion (Beder & Valentine, 1987; Mezirow, Darkenwald, & Knox, 1975; 
Washington State Training and Education Coordinating Board, 1997). Those who 
stress economic outcomes, however, consider adult literacy to be a workforce 
training program that must be coordinated with job training and welfare policy. 

Given this difference in program expectations, a critical issue is the extent 
to which the two perspectives are compatible. Can the adult literacy education 
program achieve human capital objectives while still meeting learners’ personal 
goals? The conclusions of this study, which suggest impact in both arenas, 
provide evidence in the affirmative. 

Accountability, Standards, and Judgments 

Much has been said about the need for relevant standards on which to base 
evaluative judgments of program impact. As noted earlier, the problem is that 
standards derive from goals and goals derive, at least in part, from conceptions 
regarding the definition and purposes of adult literacy education. On purposes 
and definitions there is no consensus. Clearly, to recommend that consensus be 
achieved is futile, like flying a kite in a tornado. What then should be done? One 
policy alternative is maintenance of the status quo and the resulting confusion 
over goals, which has characterized the conduct of adult literacy education since 
the inception of the federal program. This situation has resulted in a pluralistic 
system that tolerates ambiguity on the one hand but defies accountability on the 
other. Another alternative would be the imposition of uniform national standards 
and measures through legislation or bureaucratic fiat. Although outcome 
expectations would be clear, the goals currently served by adult literacy education 
would be necessarily narrowed, most likely in the direction of human capital, and 
learners who do not have goals that are sanctioned by the system might be under- 
served. A third alternative is voluntary compliance. In this model, programs 
would be able to choose from several models and corresponding standards and 
measures. The goals of the National Institute for Literacy’s (NIFL) Equipped For 
the Future (EFF) program (Stein, 1995) are one alternative. These goals are 
broad-based and comprehensive and have been generated through systematic 
qualitative research. The development of EFF performance standards is under 
way. Models and indicators might be developed for basic literacy skills 
approaches, workplace literacy, and so on. The issue is whether a voluntary 
system of outcome accountability would satisfy the current pressures for national 
accountability evident in Congress and state legislatures. 

RECOMMENDATIONS 

If the outcomes and impacts of adult literacy education are to be known 
and accountability is to be demonstrated, a three-part research strategy is 
recommended. The first is the implementation of a national outcome and impact 
reporting system that would provide useful outcome information on an ongoing 
basis. This system should be designed to produce data that is useful for planning 
as well as for accountability. The second is a comprehensive national longitudinal 
evaluation that would measure long-term impact at a level of certainty that would 
be difficult to achieve in a national reporting system. The national assessment 
should include experimental design and qualitative components. The third is 
systematic funding and improvement of state and local outcome studies needed to 
supplement a national reporting system and a major longitudinal study. 

An Effective Outcome and Impact Reporting System 

Although, as this report demonstrates, there is a substantial research 
literature that describes the inputs, processes, and outputs of adult literacy 
education, the research on program outcomes and impacts has been relatively 
meager and seriously flawed. As a result, it has been impossible to satisfy 
adequately the public demand for accountability and the need for valid planning 
information. How can this situation be changed? 

Addressing this question, in 1996 the state directors of adult education 
recommended the development of a national outcome reporting system for adult 
literacy education, and a planning study was commissioned by the Office of 
Vocational and Adult Education (OVAE) and awarded to Pelavin Associates. In 
the report of this study, Condelli and Kutner (1997) discuss four models for 
measuring outcomes: comprehensive reporting, the follow-up survey approach, 
the data sharing/workforce model, and the menu approach. 

Comprehensive reporting would entail the expansion and refinement of the 
current federal reporting system. This system relies on data collected from local 
programs that are forwarded to the federal government through the states. The 
accuracy of the current federal reporting system has been questioned (United 
States General Accounting Office, 1995), and if this strategy were followed, the 
current reporting system would have to be improved substantially. In the follow-up 
survey approach, comprehensive reporting would be augmented by periodic 
follow-up surveys conducted by the states. In the data sharing/workforce model, 
client outcome data from job training, welfare, adult literacy, and other 
workforce-oriented programs would be pooled into a common database focusing 
on workforce performance. In the menu approach, programs would be able to 
choose the measures that were relevant to them from an established list. Each of 
these approaches has advantages and disadvantages that must be weighed before 
implementation, and, according to the Pelavin report, a process has been 
established for doing so. 

Condelli and Kutner (1997) discuss several design issues that must be 
resolved if any of these models is to be effective. 

• Uniform operational definitions of “program participant,” participant’s 
instructional level, and outcome and impact variables must be established. 

• Data collection intervals must be established and uniformly implemented. 

• A viable sampling methodology must be designed and implemented. 

• Protocols must be established for reporting and handling missing data. 

Although the findings of this research certainly corroborate a need to resolve 
Condelli and Kutner’s design issues, other equally important issues must be 
resolved if an adequate outcome and impact reporting system is to result. 

Stated bluntly, under current conditions local programs lack the capacity 
to collect accurate and timely data. The problem is contextual. Open enrollment, 
part-time staff, sporadic attendance, high attrition rates, and learners’ reluctance to 
be tested confound systematic data collection. These problems are exacerbated by 
a weak programmatic infrastructure (Beder, 1996) produced by high staff 
turnover, reliance on part-time staffing, and low budgets. 

If valid and reliable data are to be collected from programs, experience 
suggests that at least three conditions must be met: 1. Data collectors must be 
thoroughly trained and adequately compensated; 2. Data collection activities must 
be carefully monitored; and 3. Data collection protocols must be rigorously 
followed. This includes collecting data from absent learners and those who have 
terminated the program. 

The reporting system would have to be standardized, carefully 
administered, and closely monitored. 

• Standardization: If an objective of the reporting system is to permit 
comparisons between states and programs on performance variables studied, 
the items that measure these variables and the data collection procedures 
would have to be standardized. 

• Administration and monitoring: Standardization would be meaningless unless 
procedures were established and implemented to ensure that standardized 
protocols were followed. Establishing procedures that were both valid and 
reliable and within the capacity of state and local programs to implement 
would be a challenge. Substantial funding and staff development would be 
required. 



A National Longitudinal Assessment 

Whichever model suggested by Condelli and Kutner is eventually 
employed, the task of establishing an accurate reporting system for adult literacy 
education will be daunting — so much so that even the most optimistic would 
foresee problems in implementation that will, at least in the short term, limit the 
credibility of the data that result. To provide the most credible data possible, a 
national assessment of adult literacy education outcomes and impacts that will 
address and resolve the problems and limitations of the three previous national 
evaluations is needed. 

• If there is “pay-dirt” in our understanding of outcomes and impacts, it probably 
lies in establishing the long-term intergenerational and cumulative effects of 
adult literacy education. The best way to do this would be through a 
longitudinal evaluation in which the same subjects were followed up for a 
period of no less than five years. Such a study is currently being planned by 
the National Center for the Study of Adult Learning and Literacy (NCSALL). 

• It is absolutely critical that such an assessment be carefully designed, and this 
task should be assigned to a highly knowledgeable and experienced design 
team. Because methodological issues are a major concern, the team should 
include technical experts familiar with the large-scale evaluation of social 



127 



O 

ERIC 



131 



NCSALL Reports #6 



January 1999 



service programs. Experience tells us, however, that while inadequate 
methodology has been a problem in outcome and impact research, many 
assessments have failed primarily for logistical reasons. For this reason, the 
team should also include researchers and practitioners who are thoroughly 
familiar with the operational context of adult literacy education. The design 
team should report prior to the award of a contract to conduct the research, and 
its recommendations should be incorporated into contract specifications. 

• Although it is difficult to fathom a national longitudinal evaluation that 
employs an experimental design, it may well be possible to conduct controlled 
studies on a limited number of programs as one component of a national 
effort. 

• Although there have been several excellent qualitative evaluations of adult 
literacy education (Fingeret, 1985; Fingeret & Danin, 1991), we were unable to 
identify qualitative outcome and impact assessments that met our definition 
of outcome and impact. If included as a component of a national evaluation, a 
qualitative assessment might answer critical questions about impact that are 
difficult or impossible to answer quantitatively: What is the meaning of 
impact from the perspective of successful learners? Are there important 
impacts of adult literacy that learners recognize in themselves but that are not 
amenable to quantitative measurement? How and to what extent do increased 
self-confidence and self-efficacy enable other positive changes in successful 
learners’ lives? These are just a few of the possible questions. 

Improved State and Local Outcome Studies 

Even if an effective national reporting system and a national longitudinal 
study were to be implemented, it would be necessary to conduct specific outcome 
assessments at the state and local levels. Such studies would be needed when 
states wished the depth and detail of analysis necessary to make complex policy 
decisions. They would also be necessary to assess the outcomes of special 
literacy initiatives and idiosyncratic programs such as programs for the 
handicapped. 

To improve state and local studies it is important that the methodological 
and design issues previously outlined in this report be resolved when studies are 
planned and conducted. Resolving these issues requires both technical skill and 
experience with the adult literacy context on the researcher’s part. For this reason, 
the competency of the researcher must be a key consideration when studies are 
commissioned. Competent studies require adequate funding. Indeed, spending 
less than what a competent study requires results in worthless data and money 
wasted. 



The most important ingredients in generating excellent outcome research are 
thoroughness and rigor. In most of the outcome studies reviewed here, the 
research was designed, the data were gathered and analyzed, the findings were 
reported, and the investigator was paid. We know something about effectiveness, 
perhaps, but nothing more, and we are left with the suspicion that the research was 
flawed in ways that were not revealed. In a thorough study, once initial findings 
have been generated, the researcher explores all possible interpretations of 
causality. Are the findings really the result of participation in adult literacy or are 
they quirks of the methodology? Alternative interpretations of the data are 
explained and limitations are fully accounted for. Are the findings driven or 
shaped by factors that can be identified by secondary analysis? If so, secondary 
analysis is conducted to add to and clarify the understanding produced by the 
research. Design and methodology are explained in detail. 

Thorough and rigorous outcome studies can and should contribute to our 
theoretical and conceptual knowledge of adult literacy education. In this way they 
can expand knowledge about adult literacy that goes beyond program impact. A 
case in point is that, despite all the outcome research in adult literacy education 
conducted thus far, we are left with a contradiction: Learners perceive that they 
gain substantially in basic skills through participation while data from tests 
conflict to the extent that it is difficult to conclude whether or not learners gain. Is 
this because learners recognize important gains in themselves that tests do not 
measure, or is it because gaining literacy skills is socially desirable and learners 
inflate their self-reports? If learners are being honest, perhaps the major gains in 
literacy are indeed contextual and personal. If the tests are right, perhaps the 
quality of instruction and staff development needs to be seriously examined. 

Either way one cuts it, the answer has important implications for policy and 
practice, and the answer is researchable within the parameters of outcome and 
impact research. 